This invention is directed to computer-assisted diagnostics and classification of cancer. Specifically, the invention is directed to in-vivo segmentation of MRI, cancer detection using multi-functional MRI (including T2-w, T1-w, dynamic contrast enhanced (DCE), and Magnetic Resonance Spectroscopy (MRS)); and their integration for a computer-aided diagnosis and classification of cancer such as prostate cancer.
Prostatic adenocarcinoma (CaP) is the second leading cause of cancer-related deaths in America, with an estimated 186,000 new cases every year (Source: American Cancer Society). Detection and surgical treatment of the early stages of tissue malignancy are usually curative. In contrast, diagnosis and treatment in late stages often have deadly results. Likewise, proper classification of the various stages of the cancer's progression is imperative for efficient and effective treatment. The current standard for detection of prostate cancer is transrectal ultrasound (TRUS) guided symmetrical needle biopsy, which has a high false negative rate associated with it.
Over the past few years, Magnetic Resonance Spectroscopic Imaging (MRSI) has emerged as a useful complement to structural MR imaging, and Magnetic Resonance Spectroscopy (MRS) along with MRI has emerged as a promising tool in diagnosis of, and potentially screening for, prostate cancer. The major problems in prostate cancer detection lie in the lack of specificity that MRI alone has in detecting cancer locations and in the sampling errors associated with systemic biopsies. While MRI provides information about the structure of the gland, MRS provides metabolic functional information about the biochemical markers of the disease. These techniques offer a non-invasive alternative to trans-rectal ultrasound biopsy procedures.
In view of the above, there is a need in the field for a reliable method for increasing specificity and sensitivity in the detection and classification of prostate cancer.
In one embodiment, the invention provides an unsupervised method of identification of an organ cancer from a spectral dataset of magnetic resonance spectroscopy (MRS), comprising the steps of: embedding the spectral data of an initial two hundred and fifty-six dimensions in a low dimensional space; applying hierarchical unsupervised k-means clustering to distinguish non-informative from informative spectra in the embedded space; pruning objects in the dominant cluster, whereby pruned objects are eliminated from subsequent analysis; and identifying sub-clusters corresponding to cancer.
In another embodiment, the invention provides an unsupervised method of segmenting regions on an in-vivo tissue (T1-w or T2-w or DCE) MRI, comprising the steps of: obtaining a three-dimensional T1-w or T2-w or DCE MR dataset; correcting bias field inhomogeneity and non-linear MR intensity artifacts, thereby creating a corrected T1-w or T2-w MR scene; extracting image features from the T1-w or T2-w MR scene; embedding the extracted image features or inherent kinetic features or a combination thereof into a low dimensional space, thereby reducing the dimensionality of the image feature space; clustering the embedded space into a number of predetermined classes, wherein the clustering is achieved by partitioning the features in the embedded space to disjointed regions, thereby creating classes and therefore segmenting the embedded space.
In another embodiment, the invention provides a system for performing unsupervised classification of a prostate image dataset comprising the steps of: obtaining a magnetic resonance spectroscopy (MRS) dataset, said dataset defining a scene using MR spectral data, identifying cancer sub-clusters via use of spectral data; obtaining a T1-w or T2-w or DCE MR image scene of the said dataset; segmenting the MR image into cancer classes; integrating the magnetic resonance spectra dataset and the magnetic resonance imaging dataset (T1-w or T2-w or DCE), thereby redefining the scene; the system for analysis of the redefined scene comprising a module for identifying cancer sub-clusters from integrated spectral and image data; a manifold learning module; and a visualization module.
In one embodiment, the invention provides a method of automatically segmenting a boundary on a T1-w or T2-w MRI image, comprising the steps of: obtaining a training MRI dataset; using expert-selected landmarks on the training MRI data, obtaining a statistical shape model; using a feature extraction method on training MRI data, obtaining a statistical appearance model; using automated hierarchical spectral clustering of the embedded space to a predetermined class on a corresponding MRS dataset, obtaining a region of interest (ROI); using the region of interest, the statistical shape model and the statistical appearance model, initializing a segmentation method of the boundary on an MRI image, and hence automatically determining the boundary.
In one embodiment, the invention provides a method of unsupervised determination of pair-wise slice correspondences between a histology database and MRI via group-wise mutual information, comprising the steps of: obtaining a histology and an MRI dataset; automatically segmenting the boundary on a MRI image; correcting bias field inhomogeneity and non-linear MR intensity artifacts, thereby creating a corrected MR scene and pre-processing the segmented MRI data; extracting image features from the pre-processed MR scene; determining similarity metrics between the extracted image features of the MRI dataset and histology dataset to find optimal correspondences; whereby the MRI dataset is a T1-w or T2-w MRI volume and the histology database are whole mount histological sections (WMHS).
In one embodiment, the invention provides a method of supervised classification for automated cancer detection using Magnetic Resonance Spectroscopy (MRS), comprising the steps of: obtaining a multimodal dataset comprising MRI images, MRS spectra and a histology database, automatically segmenting the boundary on an MRI image; correcting bias field inhomogeneity and non-linear MR intensity artifacts, thereby creating a corrected MR scene and pre-processing the segmented MRI data; determining pair-wise slice correspondences between the histology dataset and the corrected pre-processed MR image; rigidly or non-rigidly registering the determined correspondences of WMHS and MRI slices to obtain cancerous and non-cancerous regions on MRI; using data manipulation methods, pre-processing the MRS data; from corresponding MRI and histology regions obtained in the step of registering the correspondence of whole mount histological sections (WMHS) and MRI slices determining cancer and normal MRS spectra; determining similarities between input test spectra with cancer and normal spectra, and classifying MRS data as cancerous or normal; whereby the MRI image, MRS spectra and histology database is T1-w or T2-w MRI volume, MRS volume, whole mount histological sections (WMHS) data respectively.
In one embodiment, the invention provides a method of classification for automated prostate cancer detection using T2-weighted Magnetic Resonance Imaging (T2-w MRI) comprising the steps of: obtaining a multimodal dataset comprising MRI image, histology database; automatically segmenting the prostate boundary on an MRI image; correcting bias field inhomogeneity and non-linear MR intensity artifacts, thereby creating a corrected MR scene and pre-processing the segmented MRI data; determining pair-wise slice correspondences between the histology dataset and the corrected pre-processed MR image; rigidly or non-rigidly registering the correspondence of WMHS and MRI slices to obtain cancerous and non-cancerous regions on MRI; extracting image features from the pre-processed MR scene; analyzing feature representations of MRI dataset; and classifying the MRI data with training data obtained in the step of registration as cancerous or normal; whereby the MRI image and histology database is T2-w MRI volume and whole mount histological sections (WMHS) data respectively.
In one embodiment, the invention provides a method of supervised classification for automated prostate cancer detection using Dynamic Contrast Enhanced Magnetic Resonance Imaging (DCE MRI), comprising the steps of: obtaining a DCE MRI multi-time point volume dataset; and WMHS dataset; automatically segmenting the prostate boundary on an MRI image; correcting bias field inhomogeneity and non-linear MR intensity artifacts, thereby creating a corrected MR scene and pre-processing the segmented MRI data; determining pair-wise slice correspondences between the histology dataset and the corrected pre-processed MR image; rigidly or non-rigidly registering the determined correspondences of WMHS and MRI slices to obtain cancerous and non-cancerous regions on MRI; extracting image features from the pre-processed MR scene; analyzing feature representations of MRI dataset; and classifying the MRI data with training data obtained in the step of registration as cancerous or normal; whereby the MRI image and histology database is DCE MRI multi-time point volume and whole mount histological sections (WMHS) data respectively.
In one embodiment, the invention provides a method of supervised classification for automated cancer detection using integration of MRS and MRI, comprising the steps of: obtaining a multimodal dataset comprising MRI image, MRS spectra and histology database; automatically segmenting the prostate boundary on an MRI image; correcting bias field inhomogeneity and non-linear MR intensity artifacts, thereby creating a corrected MR scene and pre-processing the segmented MRI data; determining pair-wise slice correspondences between the histology dataset and the corrected pre-processed MRI; rigidly or non-rigidly registering the determined correspondences of WMHS and MRI slices to obtain cancerous and non-cancerous regions on MRI; processing the MRS dataset; determining similarities between input MRS test spectra with cancer and normal MRS spectra; extracting image or inherent kinetic features from the pre-processed MR scene; analyzing feature representations of MRI dataset; and classifying the sets of extracted spectral similarities and image feature representation data with training data obtained in the step of registration to classify MR data as cancerous or normal; whereby the MRI image, MRS spectra and histology database is T1-w or T2-w or DCE MRI volume, MRS volume, whole mount histological sections (WMHS) data respectively.
In one embodiment, the invention provides a method of supervised classification for automated prostate cancer detection using integration of DCE MRI and T2-w MRI, comprising the steps of: obtaining a multimodal dataset comprising DCE MRI image, T2-w MRI image and histology database; automatically segmenting the prostate boundary on (DCE and T2-w) MRI images; correcting bias field inhomogeneity and non-linear MR intensity artifacts, thereby creating corrected (DCE and T2-w) MR scenes and pre-processing the segmented (DCE and T2-w) MRI data; determining pair-wise slice correspondences between the histology dataset and the corrected pre-processed (DCE and T2-w) MR images; rigidly or non-rigidly registering the determined correspondences of WMHS and MRI slices to obtain cancerous and non-cancerous regions on (DCE and T2-w) MRI; extracting image features or inherent kinetic features from the pre-processed (DCE and T2-w) MR scenes; analyzing feature representations of (DCE and T2-w) MRI datasets; and classifying the sets of extracted image features and feature representation data from (DCE and T2-w) MR data with training data obtained in the step of registration to classify MR data as cancerous or normal; whereby the DCE MRI image, T2-w MRI image and histology database is DCE multi-time point volume, T2-w MRI volume and whole mount histological sections (WMHS) data respectively.
In one embodiment, the invention provides a method of supervised classification for automated prostate cancer detection using integration of MRS, DCE MRI and T2-w MRI, comprising the steps of: obtaining a multimodal dataset comprising (DCE and T2-w) MRI images, MRS spectra and histology database; automatically segmenting the prostate boundary on an MRI image; correcting bias field inhomogeneity and non-linear MR intensity artifacts, thereby creating a corrected MR scene and pre-processing the segmented MRI data; determining pair-wise slice correspondences between the histology dataset and the corrected pre-processed MR image; rigidly or non-rigidly registering the determined correspondences of WMHS and MRI slices to obtain cancerous and non-cancerous regions on MRI and so also determining cancer and normal MRS spectra; pre-processing the MRS dataset; determining similarities between input MRS test spectra and cancer and normal MRS spectra; extracting image features from the pre-processed (DCE and T2-w) MR scenes; analyzing the image feature representations of MRI dataset; and classifying the sets of spectral similarities, extracted image features representation data with training data obtained in the step of registration to classify multimodal MR data in combination or as individual modalities as cancerous or normal; whereby the multimodal dataset is MRS volume, T2-w MRI volume, DCE multi-time point MRI volume, and the histology database is whole mount histological sections (WMHS) data.
In one embodiment, the invention provides a method of supervised classification for automated detection of different grades of prostate cancer using multimodal datasets comprising the steps of: obtaining a multimodal dataset comprising (DCE and T2-w) MRI images, MRS spectra and histology database; automatically segmenting the prostate boundary on MRI images; correcting bias field inhomogeneity and non-linear MR intensity artifacts, thereby creating a corrected MR scene and pre-processing the segmented MRI data; determining pair-wise slice correspondences between the histology dataset and the corrected pre-processed MR image; rigidly or non-rigidly registering the correspondence of WMHS and MRI slices determined in to obtain regions of different Gleason grades on MRI; processing the MRS dataset; determining the MRS spectra corresponding to different Gleason grade MRI regions obtained in the step of registration; determining similarities between input test spectra with different Gleason grade MRS spectra obtained in the step of correspondences; extracting image features or inherent kinetic features from the pre-processed (DCE and T2-w) MR scenes; analyzing the image feature representations of (DCE and T2-w) MRI datasets; and classifying the sets of spectral similarities and extracted image feature representation data with training data obtained in the step of registration to classify individual modalities of MR data or a combination thereof into different grades of cancer; whereby the multimodal dataset is MRS volume, T2-w MRI volume, DCE multi-time point MRI volume, and the histology database is whole mount histological sections (WMHS) data.
Other features and advantages of the present invention will become apparent from the following detailed description examples and figures. It should be understood, however, that the detailed description and the specific examples while indicating preferred embodiments of the invention are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.
The application contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawings will be provided by the Office upon request and payment of the necessary fee.
The invention will be better understood from a reading of the following detailed description taken in conjunction with the drawings in which like reference designators are used to designate like elements, and in which:
a)-(f) are example images illustrating CaP detection results on a prostate MRS study.
g)-(i) are example graphs of spectra for the corresponding detection results of
a)-(c) are examples of intensity histograms corresponding to exemplary MR images;
d)-(f) are exemplary images of MR slices after some pre-processing steps have been applied to exemplary MR slices;
a) is an exemplary MR image;
b)-(d) are exemplary images upon applying gradient operations to the MR image shown in
a)-(i) are exemplary images illustrating results of an exemplary algorithm applied on 1.5T prostate MRI studies;
a) is an exemplary MR image with overlaid MRS grid;
b) illustrates MRS spectra curves corresponding to
c) are exemplary spectra curves;
a) is an exemplary MRI section;
b)-(d) are exemplary images after gradient operators are applied on
a)-(h) are exemplary images of MR data after an exemplary scheme and graph embedding (GE) and non-linear dimensionality reduction (NLDR) methods are applied;
a)-18(l) are exemplary images that are useful for describing sample classification;
m) illustrates the best Receiver-Operating Characteristic (ROC) curves;
a)-(c) are illustrative image intensity histograms for non-lesion areas;
a) is an exemplary MR image;
b)-(d) are corresponding exemplary images after exemplary segmentation schemes are applied to the MR image of
a)-(c) are exemplary T2 MR images with superimposed MRS grid;
d)-(f) are 3D exemplary plots with clustered features of the images shown in
a)-(d) are images useful for describing training landmark points before and after alignment;
a) and (b) are exemplary MR images showing outlier weights;
(c) first order statistical (range, κ=3), and (d) second order statistical (Haralick energy, κ=3, G=64, d=1);
This invention relates in one embodiment to computer-assisted diagnostics and classification of prostate cancer. In another embodiment, the invention relates to segmentation of the prostate boundary on MRI images, cancer detection using multimodal multi-protocol MR data; and their integration for a computer-aided diagnosis and classification system for prostate cancer.
In another embodiment, the integration of MRS and MRI improves specificity and sensitivity for screening of cancer, such as CaP, compared to what might be obtainable from MRI or MRS alone.
In one embodiment, quantitative integration of heterogeneous modalities such as MRS and MRI involves combining physically disparate sources such as image intensities and spectra and is therefore a challenging problem. In another embodiment, a multimodal MR scene C is considered, where for every location c, spectral and image information is obtained. Let S(c) denote the spectral MRS feature vector at every location c and let f(c) denote the associated intensity value for this location on the corresponding MRI image. Building a meta-classifier for CaP detection by concatenating S(c) and f(c) together at location c is not a proper solution because of (i) the different dimensionalities of S(c) and f(c) and (ii) the difference in physical meaning of S(c) and f(c). To improve discriminability between CaP and benign regions in prostate MRI, a large number of texture features is extracted in one embodiment from the MRI image. This feature space will form a texture feature vector F(c) at every location c. In another embodiment, the spectral vector S(c) and texture feature vector F(c) will each form a high dimensional feature space at every location c in a given multimodal scene C. In another embodiment, the disparate sources of information may be combined in an embedding space, divorced from their original physical meaning.
Accordingly, in one embodiment, provided herein is a method of identification of an organ cancer from a spectral dataset of magnetic resonance spectroscopy (MRS), comprising the steps of: embedding the spectral data from a two hundred and fifty-six dimensional space into a lower dimensional space; applying hierarchical unsupervised k-means clustering to distinguish non-informative from informative spectra in the embedded space; pruning objects in the dominant cluster, whereby pruned objects are eliminated from subsequent analysis; and identifying sub-clusters corresponding to cancer.
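As a minimal sketch of combining S(c) and F(c) in an embedding space rather than by direct concatenation, each modality may be reduced separately and the rescaled embeddings joined. The sketch below uses linear PCA purely as a stand-in for the embedding step; the function names, the three-dimensional target space, and the random placeholder data are assumptions for illustration and are not taken from the specification.

```python
import numpy as np

def pca_embed(X, n_dims):
    """Project rows of X onto their top n_dims principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_dims].T

def fuse_in_embedding_space(S, F, n_dims=3):
    """Embed S(c) and F(c) separately, rescale, then concatenate."""
    parts = []
    for X in (S, F):
        E = pca_embed(X, n_dims)
        # Unit-variance scaling so that neither modality dominates.
        E = E / (E.std(axis=0) + 1e-12)
        parts.append(E)
    return np.hstack(parts)

# Hypothetical data: one row per location c.
rng = np.random.default_rng(0)
S = rng.normal(size=(50, 256))   # placeholder spectral vectors S(c)
F = rng.normal(size=(50, 500))   # placeholder texture vectors F(c)
E = fuse_in_embedding_space(S, F)
```

After this step both modalities contribute the same number of unit-variance coordinates per location, which addresses the dimensionality mismatch noted above; a non-linear embedding could be substituted for `pca_embed` without changing the rest of the sketch.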
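The clustering and pruning steps above can be sketched as follows, assuming the spectra have already been embedded in a low dimensional space. The plain Lloyd's k-means, the choice k=3, the number of levels, and the synthetic data are illustrative assumptions only.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def prune_dominant(X, k=3, n_levels=2, seed=0):
    """Repeatedly cluster and discard the largest (dominant) cluster.

    Returns the indices of the objects that survive pruning.
    """
    keep = np.arange(len(X))
    for level in range(n_levels):
        if len(keep) <= k:
            break
        labels = kmeans(X[keep], k, seed=seed + level)
        dominant = np.bincount(labels, minlength=k).argmax()
        keep = keep[labels != dominant]
    return keep

# Hypothetical embedded spectra: a large background group and two small groups.
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(200, 3)),
    rng.normal(8.0, 0.5, size=(20, 3)),
    rng.normal(-8.0, 0.5, size=(20, 3)),
])
keep = prune_dominant(X, k=3, n_levels=1)
```

Because the dominant cluster always holds at least a 1/k share of the objects, each level removes at least that share; the surviving indices in `keep` would then be examined for sub-clusters corresponding to cancer.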
In another embodiment, provided herein is a method of segmenting regions on an in-vivo tissue MRI, comprising the steps of: obtaining a two-dimensional (T1-w or T2-w or DCE) MR image; correcting bias field inhomogeneity and non-linear MR intensity artifacts, thereby creating a corrected MR scene; extracting an image feature or inherent kinetic features or a combination thereof from the MR scene; embedding the extracted image feature into a low dimensional space, thereby reducing the dimensionality of the extracted image feature; clustering the embedded space to a predetermined class, wherein the clustering is achieved by partitioning the features in the embedded space to disjointed regions, thereby creating classes; and using consensus clustering, segmenting the embedded space.
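One common way to realize consensus clustering, given here only as a hedged sketch, is to run a base clustering several times, record how often each pair of locations falls in the same cluster (a co-association matrix), and derive a final partition from that matrix. The base k-means, the number of runs, the 0.5 threshold, and the connected-components step are assumptions for illustration and are not stated in the specification.

```python
import numpy as np

def kmeans_labels(X, k, seed, n_iter=30):
    """Plain Lloyd's k-means with a seed-dependent initialization."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def co_association(X, k, n_runs=10):
    """Fraction of runs in which each pair of objects shares a cluster."""
    A = np.zeros((len(X), len(X)))
    for run in range(n_runs):
        labels = kmeans_labels(X, k, seed=run)
        A += labels[:, None] == labels[None, :]
    return A / n_runs

def consensus_partition(A, threshold=0.5):
    """Connected components of the graph linking pairs with A > threshold."""
    labels = -np.ones(len(A), dtype=int)
    current = 0
    for start in range(len(A)):
        if labels[start] >= 0:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero((A[i] > threshold) & (labels < 0)):
                labels[j] = current
                stack.append(j)
        current += 1
    return labels

# Hypothetical embedded features for 60 image locations.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 2)), rng.normal(10, 1, (30, 2))])
A = co_association(X, k=2)
labels = consensus_partition(A)
```

Averaging over runs makes the final partition less sensitive to any single k-means initialization, which is the usual motivation for a consensus step.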
In one embodiment, provided herein is a method of unsupervised classification of a prostate image dataset comprising the steps of: obtaining a magnetic resonance spectroscopy (MRS) dataset, said dataset defining a scene using MR spectra, identifying cancer sub-clusters; obtaining a MR image of the cancer sub-clusters; segmenting the MR image to pre-determined cancer classes; integrating the magnetic resonance spectra dataset and the magnetic resonance imaging dataset, thereby redefining the scene.
In one embodiment, provided herein is a system for performing unsupervised classification of a prostate image dataset comprising the steps of: obtaining a magnetic resonance spectroscopy (MRS) dataset, said dataset defining a scene using MR spectral data, identifying cancer sub-clusters via use of spectral data; obtaining a T1-w or T2-w MR image scene of the said dataset; segmenting the MR image into cancer classes; integrating the magnetic resonance spectra dataset and the magnetic resonance imaging dataset (T1-w or T2-w), thereby redefining the scene; the system for analysis of the redefined scene comprising a module for identifying cancer sub-clusters from integrated spectral and image data; a manifold learning module; and a visualization module.
In one embodiment, provided herein is a method of automatically segmenting the prostate boundary on an MRI image, comprising the steps of: obtaining a training MRI dataset; using expert-selected landmarks on the training MRI data, obtaining a statistical shape model; using a feature extraction method on training MRI data, obtaining a statistical appearance model; using automated hierarchical spectral clustering of the embedded space to a predetermined class on a corresponding MRS dataset, hence obtaining a region of interest (ROI); using the region of interest, the statistical shape model and the statistical appearance model, initializing a segmentation method of the prostate boundary on an MRI image, which shall then automatically determine the prostate boundary.
In one embodiment, provided herein is a method of unsupervised determination of pair-wise slice correspondences between histology and MRI via group-wise mutual information, comprising the steps of: obtaining a histology and an MRI Dataset; automatically segmenting the prostate boundary on an MRI image; correcting bias field inhomogeneity and non-linear MR intensity artifacts, thereby creating a corrected MR scene and pre-processing the segmented MRI data; extracting an image feature from the pre-processed MR scene; determining similarity metrics between the extracted image feature of the MRI dataset and histology dataset to find optimal correspondences; whereby the dataset and histology correspond to the MRI volume and whole mount histological sections (WMHS) respectively.
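The similarity-metric step can be illustrated with a simplified sketch that scores one histology slice against every MRI slice using ordinary pair-wise mutual information computed from a joint intensity histogram. This is a plain two-image version, not the group-wise formulation named above; it also assumes the slices have already been resampled to a common grid, and the bin count and synthetic data are illustrative.

```python
import numpy as np

def mutual_information(a, b, bins=16):
    """Mutual information (in nats) between two equally sized images."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)   # marginal of a, shape (bins, 1)
    py = p.sum(axis=0, keepdims=True)   # marginal of b, shape (1, bins)
    nz = p > 0
    return float((p[nz] * np.log(p[nz] / (px @ py)[nz])).sum())

def best_matching_slice(histology_slice, mri_volume, bins=16):
    """Index of the MRI slice with the highest MI, plus all scores."""
    scores = [mutual_information(histology_slice, s, bins) for s in mri_volume]
    return int(np.argmax(scores)), scores

# Hypothetical data: the "histology" slice is an intensity-inverted copy of
# MRI slice 2, so intensities differ but the statistical dependence is strong.
rng = np.random.default_rng(0)
mri_volume = rng.random((4, 32, 32))
histology_slice = 1.0 - mri_volume[2]
idx, scores = best_matching_slice(histology_slice, mri_volume)
```

Mutual information is used in this kind of cross-modality matching because it rewards any consistent relationship between intensities, not only a linear one, which is why the inverted slice is still recovered in the example.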
In one embodiment, provided herein is a method of supervised classification for automated cancer detection using Magnetic Resonance Spectroscopy (MRS), comprising the steps of: obtaining a multimodal dataset comprising MRI image, MRS spectra and histology database; automatically segmenting the prostate boundary on an MRI image; correcting bias field inhomogeneity and non-linear MR intensity artifacts, thereby creating a corrected MR scene and pre-processing the segmented MRI data; determining pair-wise slice correspondences between the histology dataset and the corrected pre-processed MR image; rigidly or non-rigidly registering the determined correspondences of WMHS and MRI slices to obtain cancerous and non-cancerous regions on MRI; using data manipulation methods, pre-processing the MRS data; from corresponding MRI and histology regions obtained in the step of registering the correspondence of whole mount histological sections (WMHS) and MRI slices determining cancer and normal MRS spectra; determining similarities between input test spectra with cancer and normal spectra; and classifying MRS data as cancerous or normal; whereby the MRI image, MRS spectra and histology database is T2-w MRI volume, MRS volume, whole mount histological sections (WMHS) data respectively.
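A minimal sketch of the final two steps, assuming cancer and normal training spectra have already been obtained through registration, is to compare each test spectrum with the mean spectrum of each class and assign the more similar label. Pearson correlation as the similarity measure, the class-mean templates, and the synthetic single-peak spectra are all assumptions made for illustration.

```python
import numpy as np

def classify_spectra(test, cancer, normal):
    """Label each test spectrum by its more similar class-mean spectrum."""
    def corr_with(mean_spectrum):
        # Pearson correlation between every test row and one template.
        t = test - test.mean(axis=1, keepdims=True)
        m = mean_spectrum - mean_spectrum.mean()
        denom = np.linalg.norm(t, axis=1) * np.linalg.norm(m) + 1e-12
        return (t @ m) / denom

    sim_cancer = corr_with(cancer.mean(axis=0))
    sim_normal = corr_with(normal.mean(axis=0))
    return np.where(sim_cancer > sim_normal, "cancer", "normal")

# Hypothetical 256-point spectra with one peak per class at different positions.
points = np.arange(256)

def peak(center):
    return np.exp(-0.5 * ((points - center) / 5.0) ** 2)

rng = np.random.default_rng(0)
cancer = peak(80) + 0.05 * rng.normal(size=(10, 256))
normal = peak(160) + 0.05 * rng.normal(size=(10, 256))
test = np.vstack([peak(80), peak(160)]) + 0.05 * rng.normal(size=(2, 256))
labels = classify_spectra(test, cancer, normal)
```

Any trained classifier could replace the nearest-template rule; the sketch only shows how similarities to the two reference groups translate into a cancerous or normal label.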
In one embodiment, provided herein is a method of classification for automated prostate cancer detection using T2-weighted Magnetic Resonance Imaging (T2-w MRI) comprising the steps of: obtaining a multimodal dataset comprising MRI image, MRS spectra and histology database; automatically segmenting the prostate boundary on an MRI image; correcting bias field inhomogeneity and non-linear MR intensity artifacts, thereby creating a corrected MR scene and pre-processing the segmented MRI data; determining pair-wise slice correspondences between the histology dataset and the corrected pre-processed MR image; rigidly or non-rigidly registering the correspondence of WMHS and MRI slices to obtain cancerous and non-cancerous regions on MRI; extracting an image feature from the pre-processed MR scene; analyzing feature representation of MRI dataset; and classifying the MRI data as cancerous or normal; whereby the MRI image, MRS spectra and histology database is T2-w MRI volume, MRS volume, whole mount histological sections (WMHS) data respectively.
In one embodiment, provided herein is a method of supervised classification for automated prostate cancer detection using Dynamic Contrast Enhanced Magnetic Resonance Imaging (DCE MRI), comprising the steps of: obtaining a DCE MRI multi-time point volume dataset and WMHS dataset; automatically segmenting the prostate boundary on an MRI image; correcting bias field inhomogeneity and non-linear MR intensity artifacts, thereby creating a corrected MR scene and pre-processing the segmented MRI data; determining pair-wise slice correspondences between the histology dataset and the corrected pre-processed MR image; rigidly or non-rigidly registering the determined correspondences of WMHS and MRI slices to obtain cancerous and non-cancerous regions on MRI; extracting an image feature from the pre-processed MRI scene; analyzing feature representation of MRI dataset; and classifying the MRI data as cancerous or normal; whereby the MRI image and histology database is DCE MRI volume, MRS volume, whole mount histological sections (WMHS) data respectively.
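For a multi-time point DCE volume, simple per-voxel descriptors of the time-intensity curve can serve as features. The sketch below computes four such descriptors; the particular set (relative enhancement, time to peak, wash-in slope, washout slope) and the function name are illustrative assumptions, since the passage above does not enumerate the features used.

```python
import numpy as np

def kinetic_features(curves, times):
    """Per-voxel descriptors of DCE time-intensity curves.

    curves: (n_voxels, n_timepoints) signal intensities.
    times:  (n_timepoints,) acquisition times, increasing.
    Returns columns: relative enhancement, time to peak,
    wash-in slope, washout slope.
    """
    baseline = curves[:, 0]
    peak_idx = curves.argmax(axis=1)
    peak = curves.max(axis=1)
    enhancement = (peak - baseline) / (baseline + 1e-12)
    time_to_peak = times[peak_idx] - times[0]
    # Slopes are guarded so a peak at the first or last time point yields 0.
    wash_in = (peak - baseline) / np.maximum(time_to_peak, 1e-12)
    washout = (curves[:, -1] - peak) / np.maximum(times[-1] - times[peak_idx], 1e-12)
    return np.column_stack([enhancement, time_to_peak, wash_in, washout])

# Hypothetical single-voxel curve sampled at five time points.
times = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
curves = np.array([[100.0, 200.0, 300.0, 250.0, 200.0]])
features = kinetic_features(curves, times)
```

For the example curve the peak of 300 occurs at t=2, giving a relative enhancement of 2.0, a wash-in slope of 100 per time unit and a washout slope of -50 per time unit.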
In one embodiment, provided herein is a method of supervised classification for automated cancer detection using integration of MRS and (T1-w or T2-w or DCE) MRI, comprising the steps of: obtaining a multimodal dataset comprising MRI image, MRS spectra and histology database; automatically segmenting the prostate boundary on an MRI image; correcting bias field inhomogeneity and non-linear MR intensity artifacts, thereby creating a corrected MR scene and pre-processing the segmented MRI data; determining pair-wise slice correspondences between the histology dataset and the corrected pre-processed MR; rigidly or non-rigidly registering the determined correspondences of WMHS and MRI slices to obtain cancerous and non-cancerous regions on MRI; processing the MRS dataset; determining similarities between input MRS test spectra with cancer and normal MRS spectra; extracting an image feature or inherent kinetic feature or a combination thereof from the pre-processed MR scene; analyzing feature representation of MRI dataset; and classifying the sets of spectral similarities and extracted image feature representation data with training data obtained in the step of mutual registration between WMHS and MRI to classify MR data as cancerous or normal; whereby the MRI image, MRS spectra and histology database is (T1-w or T2-w or DCE) MRI volume, MRS volume, whole mount histological sections (WMHS) data respectively.
In one embodiment, provided herein is a method of supervised classification for automated prostate cancer detection using integration of DCE MRI and T2-w MRI, comprising the steps of: obtaining a multimodal dataset comprising DCE MRI image, T2-w MRI image and histology database; automatically segmenting the prostate boundary on (DCE and T2-w) MRI images; correcting bias field inhomogeneity and non-linear MR intensity artifacts, thereby creating corrected (DCE and T2-w) MR scenes and pre-processing the segmented (DCE and T2-w) MRI data; determining pair-wise slice correspondences between the histology dataset and the corrected pre-processed (DCE and T2-w) MR images; rigidly or non-rigidly registering the determined correspondences of WMHS and MRI slices to obtain cancerous and non-cancerous regions on (DCE and T2-w) MRI; extracting image features from the pre-processed (DCE and T2-w) MR scenes; analyzing feature representations of (DCE and T2-w) MRI datasets; and classifying the sets of extracted image features and feature representation data from (DCE and T2-w) MR data with training data obtained in the step of registration to classify MR data as cancerous or normal; whereby the DCE MRI image, T2-w MRI image and histology database is DCE multi-time point volume, T2-w MRI volume and whole mount histological sections (WMHS) data respectively.
In one embodiment, provided herein is a method of supervised classification for automated prostate cancer detection using integration of MRS, DCE MRI and T2-w MRI, comprising the steps of: obtaining a multimodal dataset comprising (DCE and T2-w) MRI images, MRS spectra and histology database; automatically segmenting the prostate boundary on an MRI image; correcting bias field inhomogeneity and non-linear MR intensity artifacts, thereby creating a corrected MR scene and pre-processing the segmented MRI data; determining pair-wise slice correspondences between the histology dataset and the corrected pre-processed MR image; rigidly or non-rigidly registering the determined correspondences of WMHS and MRI slices to obtain cancerous and non-cancerous regions on MRI and so also determining cancer and normal MRS spectra; pre-processing the MRS dataset; determining similarities between input MRS test spectra and cancer and normal MRS spectra; extracting image features from the pre-processed (DCE and T2-w) MR scenes; analyzing the image feature representations of MRI dataset; and classifying the sets of spectral similarities, extracted image features representation data with training data obtained in the step of registration to classify multimodal MR data in combination or as individual modalities as cancerous or normal; whereby the multimodal dataset is MRS volume, T2-w MRI volume, DCE multi-time point MRI volume, and the histology database is whole mount histological sections (WMHS) data.
In one embodiment, provided herein is a method of supervised classification for automated detection of different grades of prostate cancer using multimodal datasets comprising the steps of: obtaining a multimodal dataset comprising (DCE and T2-w) MRI images, MRS spectra and histology database; automatically segmenting the prostate boundary on MRI images; correcting bias field inhomogeneity and non-linear MR intensity artifacts, thereby creating a corrected MR scene and pre-processing the segmented MRI data; determining pair-wise slice correspondences between the histology dataset and the corrected pre-processed MR image; rigidly or non-rigidly registering the correspondence of WMHS and MRI slices determined in to obtain regions of different Gleason grades on MRI; processing the MRS dataset; determining the MRS spectra corresponding to different Gleason grade MRI regions obtained in the step of registration; determining similarities between input test spectra with different Gleason grade MRS spectra obtained in the step of correspondences; extracting image features or kinetic features or a combination thereof from the pre-processed (DCE and T2-w) MR scenes; analyzing the image feature representations of (DCE and T2-w) MRI datasets; and classifying the sets of spectral similarities, extracted image feature representation data with training data obtained in the step of registration to classify multimodal MR data in combination or as individual modalities into different grades of cancer; whereby the multimodal dataset is MRS volume, (T1-w or T2-w) MRI volume, DCE multi-time point MRI volume, and the histology database is whole mount histological sections (WMHS) data.
In one embodiment, the method of unsupervised learning schemes for segmentation of different regions on MRI described herein provides for the assignment of over 500 texture feature attributes at multiple scales and orientations (such as gradient, first and second order statistical features in certain discrete embodiments) to every spatial image location. In another embodiment, the aim is to define quantitative phenotype signatures for discriminating between different prostate tissue classes. In another embodiment, the methods of unsupervised learning schemes for segmenting different regions of MRI described herein provide for the use of manifold learning to visualize and identify inter-dependencies and relationships between different prostate tissue classes, or other organs in another discrete embodiment.
In one embodiment, the methods of unsupervised learning schemes for segmenting different regions of MRI described herein provide for the consensus embedding scheme as a modification to said manifold learning schemes to allow unsupervised segmentation of medical images in order to identify different regions, and has been applied to high resolution in vivo endorectal prostate MRI. In another embodiment, the methods of unsupervised learning schemes for segmenting different regions of MRI described herein provide for the consensus clustering scheme for unsupervised segmentation of the prostate in order to identify cancerous locations on high resolution in vivo endorectal prostate MRI.
In one embodiment, a novel approach that integrates a manifold learning scheme (spectral clustering) with an unsupervised hierarchical clustering algorithm to identify spectra corresponding to cancer on prostate MRS is presented. In another embodiment, the high dimensional information in the MR spectra is non linearly transformed to a low dimensional embedding space and via repeated clustering of the voxels in this space, non informative spectra are eliminated and only informative spectra retained. In another embodiment, the methods described herein identified MRS cancer voxels with sensitivity of 77.8%, false positive rate of 28.92%, and false negative rate of 20.88% on a total of 14 prostate MRS studies.
In one embodiment, a novel Consensus-LLE (C-LLE) scheme is presented which constructs a stable consensus embedding from across multiple low dimensional unstable LLE data representations obtained by varying the parameter (κ) controlling local linearity. In another embodiment the utility of C-LLE in creating a low dimensional stable representation of Magnetic Resonance Spectroscopy (MRS) data for identifying prostate cancer is demonstrated. In another embodiment, results of quantitative evaluation demonstrate that the C-LLE scheme has higher cancer detection sensitivity (86.90%) and specificity (85.14%) compared to LLE and other state of the art schemes currently employed for analysis of MRS data.
In one embodiment, provided herein is a powerful clinical application for segmentation of high resolution in vivo prostate MR volumes by integration of manifold learning (via consensus embedding) with consensus clustering. In another embodiment, the methods described herein enable the use of over 500 3D texture features to distinguish between different prostate regions such as malignant in one embodiment, or benign tumor areas on 1.5 T and 3 T resolution MRIs. In one embodiment, the methods provided herein describe a novel manifold learning and clustering scheme for segmentation of 3 T and 1.5 T in vivo prostate MRI. In another embodiment, the methods of evaluation of imaging datasets described herein, with respect to a defined “potential cancer space”, show improved accuracy and sensitivity to cancer. The final segmentations obtained using the methods described herein are highly indicative of cancer.
In one embodiment, the methods provided herein describe a powerful clinical application for segmentation of high resolution in vivo prostate MR volumes by integration of manifold learning and consensus clustering. In another embodiment, after extracting over 500 texture features from each dataset, consensus manifold learning methods and a consensus clustering algorithm are used to achieve optimal segmentations. In another embodiment a method for integration of in vivo MRI data with MRS data is provided, involving working at the meta-voxel resolution of MRS in one embodiment, which is significantly coarser than the resolution of in vivo MRI data alone. In another embodiment, the methods described herein, use rigorous quantitative evaluation with accompanying histological ground truth on the ACRIN database.
In one embodiment, provided herein is a powerful clinical application of a computer-aided diagnosis scheme for detection of CaP on multimodal in vivo prostate MRI and MRS data. In another embodiment, the methods described herein provide a meta-classifier based on quantitative integration of 1.5 Tesla in vivo prostate MRS and MRI data via non-linear dimensionality reduction. In another embodiment, over 350 3D texture features are extracted and analyzed via consensus embedding for each MRI scene. In one embodiment, MRS features are obtained by non-linearly projecting the 256-point spectral data via schemes such as graph embedding in one embodiment, or locally linear embedding in another embodiment. Since direct integration of the spectral and image intensity data is not possible in the original feature space owing to differences in the physical meaning of MRS and MRI data, the embodiments describing the integration schemes used in the methods and systems described herein involve data combination in the reduced dimensional embedding space. The individual projections of MRI and MRS data in the reduced dimensional space are concatenated in one embodiment, and used to drive the meta-classifier. In another embodiment, unsupervised classification via consensus clustering is performed on each of MRI, MRS, and MRS+MRI data in the reduced dimensional embedding space. In one embodiment, the methods described herein allow for qualitative and quantitative evaluation of the scheme for 16 1.5 T MRI and MRS datasets. In another embodiment, the novel multimodal integration scheme described herein for detection of cancer in a subject, such as CaP in one embodiment, or breast cancer in another embodiment, demonstrates an average sensitivity of close to 87% and an average specificity of nearly 84% at metavoxel resolution.
In one embodiment of the methods of integrating MRS data and MRI dataset for the detection and classification of prostate cancer as described herein, a hierarchical MRS segmentation scheme first identifies spectra corresponding to locations within the prostate and an ASM is initialized within the bounding box determined by the segmentation. Non-informative spectra (those lying outside the prostate) are identified in another embodiment as ones belonging to the largest cluster in the lower dimensional embedding space and are eliminated from subsequent analysis. The ASM is trained in one embodiment of the methods described herein by identifying 24 user-selected landmarks on 5 T2-MRI images. By using transformations like shear in one embodiment, or rotation, scaling, or translation in other discrete embodiments, the current shape is deformed, with constraints in place, to best fit the prostate region. The finer adjustments are made in another embodiment by changing the shape within ±2.5 standard deviations from the mean shape, using the trained ASM. In the absence of a correct initialization, in one embodiment the ASMs tend to select wrong edges, making it impossible to segment out the exact prostate boundaries.
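By way of non-limiting illustration, the ±2.5 standard deviation constraint on the shape can be sketched as follows. The sketch assumes the conventional point distribution model in which a shape is the mean shape plus a weighted sum of modes of variation, and in which the standard deviation of the i-th mode is the square root of its eigenvalue; the function names and the flat coordinate layout are hypothetical and are not taken from the disclosure.

```python
import math

def constrain_shape_params(b, eigenvalues, limit=2.5):
    """Clamp each shape parameter b_i to +/- limit * sqrt(eigenvalue_i).

    Assumes sqrt(eigenvalue_i) is the standard deviation of mode i.
    """
    clamped = []
    for b_i, lam_i in zip(b, eigenvalues):
        bound = limit * math.sqrt(lam_i)
        clamped.append(max(-bound, min(bound, b_i)))
    return clamped

def reconstruct_shape(mean_shape, modes, b):
    """Return mean_shape + sum_i b_i * modes[i] (flat lists of coordinates)."""
    shape = list(mean_shape)
    for b_i, mode in zip(b, modes):
        for j, m in enumerate(mode):
            shape[j] += b_i * m
    return shape
```

In such a sketch, the clamped parameters would be passed to `reconstruct_shape` at each iteration so that the deformed contour stays within the range of shapes seen in training.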
In one embodiment, the starting position for most shape based segmentation approaches requires manual initialization of the contour. This limits the efficacy of the segmentation scheme in that it mandates user intervention. The methods of segmentation described herein use the novel application of MRS to assist and initialize the segmentation with minimal user intervention. In another embodiment, the methods described herein show that ASMs can be accurately applied to the task of prostate segmentation. In one embodiment, the segmentation methods presented herein are able to segment out the prostate region accurately.
Provided herein is a novel, fully automated prostate segmentation scheme that integrates spectral clustering and ASMs. In one embodiment, the algorithm comprises 2 distinct stages: spectral clustering of MRS data, followed by an ASM scheme. For the first stage, a non-linear dimensionality reduction is performed, followed by hierarchical clustering on MRS data to obtain a rectangular ROI, which will serve as the initialization for an ASM. Several non-linear dimensionality reduction techniques were explored, and graph embedding was decided upon to transform the multi-dimensional MRS data into a lower-dimensional space. Graph embedding refers in one embodiment to a non-linear dimensionality reduction technique in which the relationship between adjacent objects in the higher dimensional space is preserved in the co-embedded lower dimensional space. In another embodiment, by clustering of metavoxels in this lower-dimensional space, non-informative spectra are eliminated. This dimensionality reduction and clustering is repeated in another embodiment hierarchically to yield a bounding box encompassing the organ or tissue of interest (e.g. prostate). In the second stage, the bounding box obtained from the spectral clustering scheme serves as an initialization for the ASM, in which the mean shape is transformed to fit inside this bounding box. Nearby points are then searched to find the border, and the shape is updated accordingly. The limitations of the Mahalanobis distance (MD) led to the use of mutual information (MI) to calculate the location of the prostate border. Given two images, or regions of gray values I1 and I2, the MI between I1 and I2 indicates in one embodiment how well the gray values can predict one another. It is used in another embodiment for registration and alignment tasks.
In one embodiment MI is used to search for a prostate boundary. For each training image, a window of intensity values surrounding each manually landmarked point on the prostate boundary is taken. Those intensity values are then averaged to calculate the mean ‘expected’ intensity values. The advantages of using MI over MD are: (1) the number of points sampled is not dependent on the number of training images, and (2) MI does not require an underlying Gaussian distribution, as long as the gray values are predictive of one another.
Finally, once a set of pixels presumably located on the border of the tissue or organ of interest is determined (henceforth referred to as ‘goal points’), the shape is updated to best fit these goal points. A weighting scheme is introduced in one embodiment for fitting the goal points. The goal points are weighted using two values. The first value is the normalized MI value. MI is normalized by the Entropy Correlation Coefficient (ECC), which rescales the MI values to be between 0 and 1. The second weighting value is how well the shape fit each goal point during the previous iteration, which is scaled from 0 to 1. This is the ‘outlier weight,’ where if the shape model could not deform close to a goal point, it is given a value close to 0, and if the shape model was able to deform close to a goal point, it is given a value close to 1. These two terms are multiplied together to obtain the final weighting factor for each landmark point. It is important to note that, as in traditional ASM schemes, the off-line training phase needs to be done only once, while the on-line segmentation is fully automated.
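By way of non-limiting illustration, the product of the normalized MI value and the outlier weight can be sketched as follows. The sketch assumes one common definition of the ECC, namely 2·MI/(H(A)+H(B)), computed from discrete gray values; the disclosure does not state which definition is used, and the function names and the use of already-quantized gray values are hypothetical.

```python
import math
from collections import Counter

def entropy(counts, n):
    """Shannon entropy (natural log) of a histogram with n samples."""
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

def ecc(window, expected):
    """Entropy correlation coefficient of two equal-length gray-value lists.

    Assumed definition: ECC = 2 * MI / (H(A) + H(B)), which lies in [0, 1].
    """
    n = len(window)
    h_a = entropy(Counter(window).values(), n)
    h_b = entropy(Counter(expected).values(), n)
    h_ab = entropy(Counter(zip(window, expected)).values(), n)
    mi = h_a + h_b - h_ab  # mutual information
    if h_a + h_b == 0:
        return 0.0  # both windows are constant: no information to share
    return 2.0 * mi / (h_a + h_b)

def goal_point_weight(window, expected, outlier_weight):
    """Final weight of a goal point: normalized MI times the outlier weight."""
    return ecc(window, expected) * outlier_weight
```

Here `window` would hold the gray values sampled around a candidate goal point, `expected` the mean expected values from training, and `outlier_weight` the value in [0, 1] carried over from the previous iteration.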
In one embodiment, the methods and systems described herein provide a fully automated scheme in which spectral clustering is performed on MRS data to obtain a bounding box that is used to initialize an ASM search, MI is used instead of MD to find points on the prostate border, and outlier distances and MI values are used to weight the landmark points. An exhaustive evaluation of the model via randomized cross validation is also performed to assess segmentation accuracy against expert segmentations. In another embodiment, model parameter sensitivity and segmentation efficiency are also assessed.
In one embodiment, MRI and MRS have individually proven to be excellent tools for automated prostate cancer detection in the past. In one embodiment the novel methods of integration described herein integrate both the structural and the chemical information obtained from MRI and MRS to improve on the sensitivity and specificity of CAP detection. Two novel approaches for integration of multimodal data are described herein in the methods provided. In one embodiment, the performance of classifying high dimensional feature data is compared against classifying low dimensional embeddings of high dimensional data. In one embodiment, the methods described herein show that sensitivity and specificity could be improved by integrating both modalities together. In another embodiment, MRS data performs better than MRI data when the results are compared at the meta-voxel level. In another embodiment, the results obtained by fusion of MRS and MRI data show an improvement in the PPV value as well as the FN rate as compared to MRS data, and are comparable in terms of the sensitivity to prostate cancer. In one embodiment, provided herein is a method to analyze in vivo MR data to achieve automated segmentation of prostatic structures which operates at the much finer MRI voxel level. The methods described herein operate in another embodiment at a coarser resolution, but have clinical application in the field of cancer screening, making them more robust from this perspective.
In one embodiment, provided herein is a novel comprehensive methodology for segmentation, registration, and detection of cancer, such as prostate cancer in one embodiment, from 3 Tesla in vivo DCE MR images. A multi-attribute active shape model based segmentation scheme (MANTRA) is used in another embodiment to automatically segment the prostate from in vivo DCE and T2-w images, following which a multimodal registration algorithm, COFEMI, is used to map the spatial extent of CaP from corresponding whole mount histology to the DCE-MRI slices. Owing to the presence of MR image intensity non-standardness, a non-linear DR scheme (LLE) is used, coupled with consensus clustering, to identify cancerous image pixels. In another embodiment, the CaP detection results (60.72% sensitivity, 83.24% specificity, and 77.20% accuracy) were superior to those obtained via the 3TP method (41.53% sensitivity, 70.04% specificity, 67.37% accuracy).
In one embodiment, provided herein is a Multi-Attribute, Non-Initializing, Texture Reconstruction Based Active Shape Model (MANTRA). In another embodiment, the MANTRA scheme described herein provides PCA-based texture models that are used to better represent the border instead of simply using mean intensities as in the traditional ASM. In one embodiment, the MANTRA scheme described herein uses CMI as an improved border detection metric to overcome several inherent limitations of the Mahalanobis distance. In another embodiment, the use of kNN entropic graphs in the MANTRA scheme described herein makes it possible to compute CMI in higher dimensions. In another embodiment, using multiple attributes in the MANTRA scheme described herein gives better results than simply using intensities. In another embodiment, a multi-resolution approach is used to overcome initialization bias and problems with noise at higher resolutions in the MANTRA scheme described herein, which is used in the methods and systems provided herein. In one embodiment, MANTRA was successful with different field strengths (1.5T and 3T) and on multiple protocols (DCE and T2). In another embodiment, incorporation of multiple texture features improves results significantly, indicating that a multi-attribute approach is advantageous.
In another embodiment provided herein is a novel segmentation algorithm: Multi-Attribute Non-Initializing Texture Reconstruction Based ASM (MANTRA), comprising a new border detection methodology, from which a statistical shapes model can be fitted.
In one embodiment, provided herein is the use of the Adaboost method in conjunction with the prostate boundary segmentation scheme (MANTRA) to create a unique feature ensemble for each section of the object border, thus finding optimal local feature spaces. Each landmark point can therefore move in its own feature space, which is combined using a weighted average of the Cartesian coordinates to return a final new landmark point. Tests show that the method disclosed herein is more accurate than the traditional method for landmark detection, and converges in fewer iterations.
In one embodiment, provided herein is an integrated detection scheme for high resolution in vivo 3 Tesla (T) structural T2-w and functional DCE MRI data for the detection of CaP. A supervised classifier is trained using MR images on which the spatial extent of CaP has been obtained via non-rigid registration of corresponding histology and MR images. Textural representations of the T2-w data and the multiple time-point functional information from DCE data are integrated in multiple ways for classification. In another embodiment, classification based on the integration of T2-w texture data and DCE was found to significantly outperform classification based on either of the individual modalities, with an average area under the ROC curve of 0.692 (as compared to 0.668 and 0.531 for the individual modalities). The methods described herein suggest, in another embodiment, that the fusion of structural and functional information yields a higher diagnostic accuracy as compared to any individual modality. In another embodiment, the methods described herein provide for the integration of such disparate modalities in a space divorced from the original physicality of the modalities, ensuring better diagnostic accuracy.
The term “about” as used herein means in quantitative terms plus or minus 5%, or in another embodiment plus or minus 10%, or in another embodiment plus or minus 15%, or in another embodiment plus or minus 20%.
The following examples are presented in order to more fully illustrate the preferred embodiments of the invention. They should in no way be construed, however, as limiting the broad scope of the invention.
Most automated analysis work for MRS for cancer detection has focused on developing fitting techniques that yield peak areas or relative metabolic concentrations of different metabolites like choline, creatine and citrate as accurately as possible. The automated peak finding algorithms suffer from problems associated with the noisy data which worsens when a large baseline is present along with low signal to noise ratio. z-score (ratio of difference between population mean and individual score to the population standard deviation) analysis was suggested as an automated technique for quantitative assessment of 3D MRSI data for glioma. A predefined threshold value of the z score is used to classify spectra in two classes: malignant and benign. Some worked on the quantification of prostate MRSI by model based time fitting and frequency domain analysis. Some researchers have applied linear dimensionality reduction methods such as independent component analysis (ICA), principal component analysis (PCA) in conjunction with classifiers to separate different tissue classes from brain MRS. However, it was previously demonstrated that due to inherent non linearity in high dimensional biomedical studies, linear reduction methods are limited for purposes of classification.
Methods
Data Description
We represent the 3D prostate T2 weighted scene by C=(C, ƒ) where C is a 3D grid of voxels c∈C and ƒ is a function that assigns an intensity value to every c∈C. A spectral image Cs=(G, g) is also defined where G is also a 3D grid superposed on C and G⊂C. For every spatial location u∈G, there is an associated spectrum g(u). Hence while ƒ(c) is a scalar, g(u) is a 256 dimensional vector valued function. Note that the size of each spatial grid location u∈G is equivalent to 8×8 voxels c∈C.
The spectral datasets used for the study were collected during the ACRIN multi-site trial of prostate MRS acquired with 1.5 Tesla GE Medical Systems through the PROSE(c) package (voxel width 0.4×0.4×3 mm). Datasets were obtained from 14 patients having CAP with different degrees of severity. The spectral grid was contained in DICOM images from which the 16×16 grid containing N=256 spectra was obtained using IDL. Of these N spectra, over half are zero-padded or non-informative, lying outside the prostate.
Determining Ground Truth for Cancer Extent on Prostate MRS
Since whole mount histological sections corresponding to the MR/MRS studies were not available for the ACRIN database, it was only possible to determine approximate spatial extent of cancer within G. During the MRSI ACRIN study, arbitrary divisions were established by the radiologists to obtain a rough estimate of the location of cancer. The prostate was first divided into two regions: Left (L) and Right (R) and the slices were then further divided into three regions: Base (B), Midgland (M) and Apex (A). Thus a total of 6 potential cancer locations were defined: Left Base (LB), Left Midgland (LM), Left Apex (LA), Right Base (RB), Right Midgland (RM) and Right Apex (RA). Presence or absence of cancer in each of these 6 candidate locations, determined via needle biopsy, was recorded. The maximum diameter of the cancer was also recorded in each of the 6 candidate locations.
For an MRS scene Cs with known cancer in the left midgland (LM), the prostate contained in a 3×6 grid and the prostate midgland extending over 2 contiguous slices, a potential cancer space GP⊂G was defined within which the cancer is present. If G is separated into two equal right and left halves of 3×3, the total number of voxels u∈GP is 18 (3×3×2). The total number of actual cancer voxels within the cancer space, GP, is obtained by knowledge of the maximum diameter of cancer and given as:
where ⌈ ⌉ refers to the ceiling operation and Δx, Δy refer to the size of voxel u in the x and y dimensions. Hence for a study with a cancer with maximum diameter of 13.75 mm in LM, 8 voxels within GP correspond to cancer. It is noted that the cancer ground truth so determined does not convey information regarding the precise spatial location of cancer voxels within GP, only the number.
Manifold Learning Via Spectral Clustering
The spectra g(u), for u∈G, lie in a 256 dimensional space. Hence, the goal is to find an embedding vector X̂(u) for each voxel u∈G, and its associated class ω, such that the distance between g(u) and ω is monotonically related to G in the lower dimensional space. Hence if voxels u, v∈G both belong to class ω, then [X̂(u)−X̂(v)]² should be small. To compute the optimal embedding, first a matrix W is defined, representing the similarity between any two objects u, v∈G in high dimensional feature space.
W(u,v) = e^{-\|g(u)-g(v)\|} \in \mathbb{R}^{|G| \times |G|}, (1)
where |G| is the cardinality of set G. The embedding vector {circumflex over (X)} is obtained from the maximization of the function:
where D(u, u)=Σv W(u, v) and γ=|G|−1. The embedding space is defined by the eigenvectors corresponding to the smallest β eigenvalues of (D−W)X̂=λDX̂. For every u∈G, the embedding X̂(u) contains the coordinates of u in the embedding space and is given as X̂(u)=[êA(u) | A∈{1, 2, . . . , β}], where êA(u) is the element associated with u in the A-th of these eigenvectors.
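By way of non-limiting illustration, the embedding step can be sketched as follows. The sketch builds a similarity matrix that decays exponentially with the distance between spectra, forms the diagonal matrix D of row sums, and solves the generalized eigenproblem (D−W)x=λDx through its symmetric normalized form. Rescaling the distances by their mean and discarding the trivial constant eigenvector are choices made for this sketch only, and the function name is hypothetical.

```python
import numpy as np

def graph_embedding(spectra, beta=3):
    """Embed n spectra (rows of an (n, d) array) into beta dimensions."""
    g = np.asarray(spectra, dtype=float)
    # Pairwise Euclidean distances between spectra.
    sq = (g ** 2).sum(axis=1)
    dist = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2.0 * g @ g.T, 0.0))
    # Similarity matrix W; distances are rescaled by their mean (assumption)
    # so that the exponential does not underflow for large spectra values.
    scale = dist.mean() or 1.0
    w = np.exp(-dist / scale)
    d = w.sum(axis=1)  # diagonal of D
    # (D - W) x = lambda D x  is equivalent to  (I - D^-1/2 W D^-1/2) y = lambda y
    # with x = D^-1/2 y.
    s = 1.0 / np.sqrt(d)
    lap = np.eye(len(g)) - s[:, None] * w * s[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues in ascending order
    # Skip the first (constant) eigenvector, keep the next beta.
    return s[:, None] * vecs[:, 1:beta + 1]
```

Each row of the returned array would play the role of the embedding coordinates of one voxel; for a 16×16 spectral grid the input would have 256 rows of 256 values each.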
Hierarchical Cascade to Prune Non-Informative Spectra
At each iteration t, a subset of voxels G̃t⊂G is obtained by eliminating the non-informative spectra g(u). The voxels u∈G̃t are aggregated into clusters VT1, VT2, VT3 by applying k-means clustering to all u∈G in the low dimensional embedding X̂(u). The number of clusters k=3 was chosen empirically to correspond to cancer, benign and classes whose attributes are intermediate to normal tissue and cancer (e.g. benign hyperplasia (BPH), high-grade prostatic intraepithelial neoplasia (HGPIN)). Initially, most of the locations u∈G correspond to zero padded or non informative spectra and hence the scheme proceeds by eliminating the dominant cluster. Clusters corresponding to cancer and areas within the prostate only become resolvable at higher levels of the cascade scheme after elimination of the dominant non informative spectra. The algorithm below describes precisely how the methodology works.
Since an unsupervised learning approach is used, it is not clear which of VT1, VT2, VT3, actually represents the cancer cluster. The motivation behind the Hierarclust MRS prostate algorithm however is to obtain clusters VT1, VT2, VT3 which represent, to the extent possible, distinct tissue classes.
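By way of non-limiting illustration, the cascade described above (embed, cluster with k=3, discard the dominant cluster, repeat) can be sketched as follows. This is a sketch under stated assumptions and not a reproduction of the Hierarclust MRS prostate algorithm: the embedding is a compact graph-embedding stand-in, the k-means routine uses a deterministic farthest-point initialization chosen for this sketch, and all function names and the number of levels are hypothetical.

```python
import numpy as np

def embed(x, beta=2):
    """Graph-embedding stand-in: low eigenvectors of the normalized Laplacian."""
    sq = (x ** 2).sum(axis=1)
    dist = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0))
    w = np.exp(-dist / (dist.mean() or 1.0))
    s = 1.0 / np.sqrt(w.sum(axis=1))
    _, vecs = np.linalg.eigh(np.eye(len(x)) - s[:, None] * w * s[None, :])
    return s[:, None] * vecs[:, 1:beta + 1]

def kmeans(x, k=3, iters=50):
    """Lloyd's k-means with farthest-point initialization (assumption)."""
    centers = [x[0]]
    for _ in range(1, k):
        d2 = np.min([((x - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(x[d2.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    return labels

def prune_cascade(spectra, levels=2, k=3):
    """Return indices of voxels kept after `levels` rounds of pruning."""
    spectra = np.asarray(spectra, dtype=float)
    keep = np.arange(len(spectra))
    for _ in range(levels):
        if len(keep) <= k:
            break  # too few voxels left to cluster further
        labels = kmeans(embed(spectra[keep]), k=k)
        dominant = np.bincount(labels, minlength=k).argmax()
        keep = keep[labels != dominant]  # eliminate the dominant cluster
    return keep
```

The indices returned by `prune_cascade` would identify the spectra retained for the final clustering into three tissue classes; how many levels are appropriate would depend on the proportion of zero-padded spectra in the grid.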
Results
Qualitative Results
The images in panels (a)-(c) show the potential cancer location volume within G on a single 2D slice of a T2 weighted prostate MR scene for 3 different studies.
Quantitative Results
Table 1 shows the quantitative results for 14 different studies. True positive (TP), False positive (FP) and False negative (FN) fractions for every dataset were obtained by comparing the automated results with the ground truth voxels for all 3 classes obtained. The class which corresponded to maximum TP and minimum FP and FN rates was identified as the cancer class and the respective TP, FP and FN values were reported for that particular class. TP, FP and FN percentage values for each dataset were then calculated by dividing the TP, FP and FN fractions by the total number of ground truth voxels determined as described in Section 2.2. Average results over 14 studies have been reported. The scheme appears to have high cancer detection sensitivity and specificity.
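By way of non-limiting illustration, the bookkeeping described above can be sketched as follows. The tie-breaking rule used to select the cancer class (most TP voxels, then fewest FP plus FN voxels) is one possible reading of the selection criterion and is an assumption, as are the function names and the example counts.

```python
def detection_percentages(tp, fp, fn, n_ground_truth):
    """Express TP, FP and FN voxel counts as percentages of the ground truth count."""
    return tuple(100.0 * count / n_ground_truth for count in (tp, fp, fn))

def pick_cancer_class(counts_by_cluster):
    """Pick the cluster with the most TP voxels; ties go to the fewest FP + FN.

    counts_by_cluster maps a cluster name to a (tp, fp, fn) tuple.
    """
    return max(
        counts_by_cluster,
        key=lambda name: (
            counts_by_cluster[name][0],
            -(counts_by_cluster[name][1] + counts_by_cluster[name][2]),
        ),
    )
```

For a hypothetical study with 8 ground truth voxels, a cluster with 6 TP, 2 FP and 2 FN voxels would be reported as 75%, 25% and 25% respectively.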
Due to inherent non-linearities in biomedical data, non-linear dimensionality reduction (NLDR) schemes such as Locally Linear Embedding (LLE) have begun to be employed for data analysis and visualization. LLE attempts to preserve geodesic distances between objects while projecting the data from the high to the low dimensional feature space, unlike linear dimensionality reduction (DR) schemes such as Principal Component Analysis (PCA), which preserve the Euclidean distances between objects. The low dimensional data representations obtained via LLE are a function of κ, a parameter controlling the size of the neighborhood within which local linearity is assumed and used to approximate geodesic distances. Since LLE is typically used in an unsupervised context for visualizing and identifying object clusters, the optimal value of κ is not obvious a priori.
Roweis and Saul suggest that varying κ over a wide range of values, still yields stable, consistent low dimensional embeddings for dense synthetic datasets. Previous experiments on real biomedical data, suggest otherwise. Further, for sparsely populated datasets, the most common failure of LLE is to map faraway points to adjacent locations in the embedding space depending on the choice of κ.
Automatically estimating κ is largely an open problem, though an attempt was made to adaptively determine a globally optimal κ value. However, these experiments were limited to dense synthetic datasets. The operating assumption here is that in general, no single global optimal value of κ can be applied to learning the low dimensional manifold over the entire data space and that different values of κ are required in different regions of the data space to optimally reconstruct locally linear neighborhoods.
The novel Consensus-LLE (C-LLE) algorithm creates a stable low dimensional representation of the data, in a manner analogous to building classifier ensembles such as Breiman's Bagging scheme. Instead of attempting to estimate a single globally optimal κ value to be applied to the entire dataset, the scheme aims to estimate the true pairwise object adjacency D(c, d) in the low dimensional embedding between two objects c, d∈C. The problem of estimating object distances D(c, d) is formulated as a Maximum Likelihood Estimation (MLE) problem from multiple approximations Dκ(c, d) obtained by varying κ, which are assumed to be unstable and uncorrelated. The scheme disclosed herein thus differs from related work in two fundamental ways: (a) C-LLE attempts to reconstruct the true low dimensional data manifold by learning pairwise object distances across the entire data space and avoids the κ estimation problem, and (b) C-LLE learns the low dimensional manifold in a locally adaptive fashion, in contrast to schemes that attempt to learn an optimal κ value which is then uniformly applied to learning the manifold across the entire data space.
Prostate MRS is a non-invasive imaging technique used to detect changes in the concentration of the specific metabolites (choline, creatine and citrate) which have been shown to be representative of prostate cancer (CaP). Peak detection algorithms to automatically determine metabolite peaks have been found to be sensitive to the presence of noise and other biomedical signal artifacts. In Example 1, a novel hierarchical clustering algorithm employing NLDR was presented for automatically distinguishing between normal and diseased spectra within the prostate. In this Example, the newly developed C-LLE algorithm is used to distinguish between benign and cancerous MR spectra on a total of 18 studies, and C-LLE's performance is compared with LLE and 3 other state of the art MRS analysis methods. In addition, disclosed in this Example is the development of a novel Independent Component Analysis (ICA) algorithm to (a) automatically determine search ranges within which to perform peak detection, and (b) validate the clusters obtained via C-LLE.
Consensus-Locally Linear Embedding (C-LLE)
Issues with LLE
The objective behind LLE is to non-linearly map objects c, d∈C that are adjacent in the M dimensional ambient space (F(c), F(d)) to adjacent locations in the low dimensional embedding (S(c), S(d)), where (S(c), S(d)) represent the m-dimensional dominant eigenvectors corresponding to c, d (m≪M). If d is in the κ neighborhood of c∈C, then c, d∈C are assumed to be linearly related. LLE attempts to non-linearly project each F(c) to S(c) so that the neighborhood of c∈C is preserved. LLE is sensitive to the choice of κ since different values of κ will result in different low dimensional data representations.
Relationship Between C-LLE and Bagging Classifiers
The aim behind constructing ensemble classifiers such as Bagging [Breiman, 1996] is to reduce the variance and bias across weak classifiers. In Bagging, for an input object c∈C, a sequence of weak predictors φ(c, Sk) is generated from K bootstrapped training sets Sk where 1≤k≤K. A strong Bagged classifier φBag(c) is obtained by averaging or voting over the multiple weak classifiers φ(c, Sk), k∈{1 . . . K}. An analogous idea is used for C-LLE whereby several weak embeddings Sκ(c) are combined across different values of κ∈{1 . . . K} to obtain a comprehensive stable low dimensional data embedding, with lower variance and bias compared to individual weak embeddings. The hypothesis is that for any c, d∈C, the pairwise object distance in the low dimensional space is faithfully represented in the stable consensus embedding Ŝ(c), for c∈C.
Maximum Likelihood Estimation (MLE) of Object Adjacency
The spirit behind C-LLE is the direct determination of pairwise object adjacencies in the low dimensional embedding space as opposed to κ estimation. For each c, d the aim is to find the true distance D̂ψ(c, d) between c, d∈C in some lower dimensional embedding space, where ψ is an appropriately defined distance metric. Given multiple lower dimensional embeddings, the distance between c, d can be expressed as a distribution Dκ(c, d) where for brevity the metric notation has been dropped. The problem of determining D̂(c, d) can be posed as an MLE problem. Thus this problem can be rewritten as,
where φD is a set of low dimensional distance estimates between c, dεC, and based on the assumption that the lower dimensional embeddings obtained for κε{1, . . . K} are independent. The goal is to find the MLE of D, {tilde over (D)} that maximizes ln p(φD|D) for c, dεC. Intuitively this corresponds to computing the peak (mode) of the distribution p(φD|{circumflex over (D)}).
Algorithm for C-LLE
(a) Step 1. Generate Multiple Lower Dimensional Embeddings:
Multiple lower dimensional embeddings are generated by varying κε{1 . . . K} using LLE. Each embedding Sκ(c) will hence represent adjacencies between objects ci, cjεC, i, jε{1 . . . |C|}, where |C| is the cardinality of C. Thus ∥Sκ(ci)−Sκ(cj)∥ψ will vary as a function of κ.
(b) Step 2. Obtain MLE of Pairwise Object Adjacency:
A confusion matrix Wκεℝ^(|C|×|C|) representing the adjacency between any two objects ci, cjεC, i, jε{1, . . . , |C|} in the lower dimensional embedding representation Sκ(c) is calculated as:
Wκ(i,j)=Dκ(ci,cj)=∥Sκ(ci)−Sκ(cj)∥ψ (2.4.1)
where ci, cjεC, for i, jε{1, . . . , |C|}, κε{1, . . . , K}, and ψ in this case is the L2 norm. The MLE of Dκ(ci, cj) is estimated as the mode of all adjacency values in Wκ(i, j) over all κ. This {circumflex over (D)} for all cεC is then used to obtain the new confusion matrix Ŵ.
(c) Step 3. Multidimensional Scaling (MDS):
MDS [Venna, 2006] is applied to Ŵ to achieve the final combined embedding Ŝ(c) for cεC. MDS is implemented as a linear method that preserves the Euclidean geometry between each pair of objects ci, cjεC, i, jε{1, . . . , |C|}. This is done by finding optimal positions for the data points ci, cj in lower-dimensional space through minimization of the least squares error in the input pairwise distances in Ŵ.
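Steps 2 and 3 above can be illustrated with the following minimal sketch. It assumes the individual embeddings Sκ(c) have already been computed (LLE itself is not shown), and it assumes that the mode of the continuous distance values is approximated by the centre of the fullest histogram bin, a choice the text does not specify. The function names and the `bins` parameter are illustrative only.

```python
import numpy as np

def pairwise_distances(S):
    """L2 distance matrix W_kappa(i, j) = ||S(c_i) - S(c_j)|| for one embedding."""
    diff = S[:, None, :] - S[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def consensus_distances(embeddings, bins=10):
    """Estimate D-hat(i, j) as the mode over kappa of W_kappa(i, j).

    Assumption: the mode is approximated by the centre of the fullest
    histogram bin of the K distance values.
    """
    W = np.stack([pairwise_distances(S) for S in embeddings])  # shape (K, n, n)
    n = W.shape[1]
    W_hat = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            counts, edges = np.histogram(W[:, i, j], bins=bins)
            b = counts.argmax()
            W_hat[i, j] = W_hat[j, i] = 0.5 * (edges[b] + edges[b + 1])
    return W_hat

def classical_mds(W_hat, m):
    """Classical (linear) MDS: embed a distance matrix into m dimensions."""
    n = W_hat.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centring matrix
    B = -0.5 * J @ (W_hat ** 2) @ J           # double-centred Gram matrix
    vals, vecs = np.linalg.eigh(B)            # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:m]          # keep the m largest
    return vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0.0, None))
```

Calling `classical_mds(consensus_distances(embeddings), m)` would then yield one m-dimensional consensus coordinate per object.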
Prostate Cancer Detection on MRS Via C-LLE
Notation
We define a spectral scene C=(C, ƒ) where C is a 3D grid of spatial locations. For each spatial location ciεC, iε{1, . . . |C|}, there is an associated 256-dimensional valued spectral vector F(c)=[ƒu(c)|uε{1 . . . , 256}], where ƒu(c) represents the concentration of different biochemicals (such as creatine, citrate, and choline) at every spatial location c.
Data Description
A total of 18 1.5 T in vivo endorectal T2-weighted MRI and MRS ACRIN studies (http://www.acrin.org/6659_protocol.html) were obtained prior to prostatectomy. Partial ground truth for the CaP extent on MR studies is available in the form of approximate sextant locations and sizes for each study. The maximum diameter of the tumor is also recorded in each of the 6 prostate sextants (left base, left midgland, left apex, right base, right midgland, right apex). The tumor size and sextant locations were used to identify a potential cancer space used for performing a semi-quantitative evaluation of the CAD scheme. Additional details on identifying this cancer space are provided in Example 1.
C-LLE and Consensus Clustering for CaP Detection on MRS
To overcome the instability associated with centroid based clustering algorithms, multiple weak clusterings Vt1, Vt2, Vt3, tε{0, . . . T} are generated by repeated application of k-means clustering on the combined low dimensional manifold {circumflex over (S)}(c), for all cεC. It is assumed that each prostate spectrum could be classified as one of three classes: cancer, benign, and other tissue classes (e.g. benign prostatic hyperplasia (BPH)). Each cluster Vt is a set of objects which has been assigned the same class label by the k-means clustering algorithm. As the number of elements in each cluster tends to change for each such iteration of k-means, a co-association matrix H is calculated, with the underlying assumption that objects belonging to a natural cluster are very likely to be co-located in the same cluster for each iteration. Co-occurrences of pairs of objects ci, cjεC in the same cluster Vt are hence taken as votes for their association. H(i, j) thus represents the number of times ci, cjεC, for i, jε{1, . . . |C|}, were found in the same cluster over T iterations. MDS is applied to H followed by a final unsupervised classification using k-means, to obtain the final stable clusters {circumflex over (V)}1, {circumflex over (V)}2, {circumflex over (V)}3.
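The voting scheme behind the co-association matrix H can be sketched as follows. The label lists are assumed to come from repeated k-means runs (not shown), and the function name is illustrative.

```python
def co_association(labelings):
    """H[i][j] = number of clusterings in which objects i and j share a label."""
    n = len(labelings[0])
    H = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    H[i][j] += 1
    return H

# Three hypothetical k-means runs over five objects. Label values may be
# permuted between runs; only co-membership matters.
runs = [[0, 0, 1, 1, 2], [1, 1, 0, 0, 2], [0, 0, 0, 1, 2]]
H = co_association(runs)
```

Because only co-membership is counted, the arbitrary numbering of clusters in each run does not affect H.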
Model Based Peak Integration Scheme Via Independent Component Analysis (ICA)
Peak detection on prostate MRS is a difficult problem due to noise and spectral contributions from extra-prostatic regions. A secondary contribution of this work is a model based approach to localize choline, creatine and citrate peaks based on Independent Component Analysis (ICA). ICA is a multivariate decomposition technique which linearly transforms the observed data into statistically maximally independent components (ICs). For a set of voxels identified offline as cancer, χCaP⊂C, A independent components, αIC, αε{1, . . . A}, are obtained, which represent spectral contributions of choline, creatine and citrate for prostate cancer. The parts per million (ppm) ranges (νcc, νcr) on the X-axis are then learnt for choline+creatine and citrate from αIC, αε{1, . . . A}. Peak detection is then performed on C to identify choline, creatine and citrate peaks within the ranges νcc and νcr. The area under the choline+creatine peak (Vcc) and under the citrate peak (Vcr) is obtained via integration for all voxels cεC. The Zakian index (γ(c)) is then calculated as the ratio Vcc(c)/Vcr(c). A pre-defined threshold determined by radiologists is used to classify the spectra as cancer/benign based on γ(c) for cεC.
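The peak integration and ratio described above may be sketched as follows. Trapezoidal integration is an assumption (the text only says "integration"), and the ppm ranges passed in are placeholders, since in the described scheme they are learnt from the independent components.

```python
def peak_area(ppm, spectrum, lo, hi):
    """Trapezoidal area under the spectrum for lo <= ppm <= hi."""
    pts = sorted((x, y) for x, y in zip(ppm, spectrum) if lo <= x <= hi)
    return sum(0.5 * (y0 + y1) * (x1 - x0)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

def zakian_index(ppm, spectrum, cc_range, cr_range):
    """Ratio of choline+creatine area (V_cc) to citrate area (V_cr)."""
    v_cc = peak_area(ppm, spectrum, *cc_range)
    v_cr = peak_area(ppm, spectrum, *cr_range)
    return v_cc / v_cr   # undefined when the citrate area is zero

# Synthetic spectrum with one triangular peak in each placeholder range.
ppm = [2.5, 2.6, 2.7, 3.0, 3.1, 3.2]
spectrum = [0.0, 4.0, 0.0, 0.0, 2.0, 0.0]
gamma = zakian_index(ppm, spectrum, cc_range=(3.0, 3.2), cr_range=(2.5, 2.7))
```

A voxel would then be labeled as cancer or benign by comparing `gamma` with the radiologist-defined threshold, whose value is not given in the text.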
z-Score and Principal Component Analysis (PCA)
z-score is a statistical measure defined as the ratio of the difference between the population mean and individual score to the population standard deviation. For a set of voxels, Φtr of c, Φtr⊂C, the mean spectral vector Fμ=[ƒuμ|uε{1, . . . 256}] is obtained and the corresponding standard deviation vector Fσ=[ƒuσ|uε{1, . . . 256}], where
The z-score at each cεC is given as
where |Φtr| is the cardinality of Φtr. A predefined threshold θz is then used to identify each cεC as cancerous or not based on whether z(c)≧θz. PCA attempts to find the orthogonal axes that contain the greatest amount of variance in the data using eigenvalue decomposition. Similar to the C-LLE scheme, each cεC is described by 5 principal components, SPCA(c) which contain 98% of the data variance. Consensus clustering is then applied on SPCA(c), to cluster each cεC into one of 3 classes.
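A minimal sketch of the z-score classifier follows. Because the defining equations are not reproduced in the text above, the form used here, z(c) = ∥F(c)−Fμ∥2/∥Fσ∥2 with a population standard deviation, is an assumption, as are the function and parameter names.

```python
import numpy as np

def zscore_classify(F, train_idx, theta_z):
    """Return (z, is_cancer) for every spectrum in F.

    F         : (n_voxels, n_features) array of spectral vectors.
    train_idx : indices of the voxel subset used to estimate mean and std.
    theta_z   : predefined threshold on the z-score.
    Assumed form: z(c) = ||F(c) - F_mu||_2 / ||F_sigma||_2.
    """
    F = np.asarray(F, dtype=float)
    F_mu = F[train_idx].mean(axis=0)      # mean spectral vector
    F_sigma = F[train_idx].std(axis=0)    # per-feature standard deviation
    z = np.linalg.norm(F - F_mu, axis=1) / np.linalg.norm(F_sigma)
    return z, z >= theta_z

F = [[0, 0], [2, 2], [1, 1], [10, 10]]
z, is_cancer = zscore_classify(F, train_idx=[0, 1], theta_z=3.0)
```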
Results and Discussion
Qualitative Results
Quantitative Results
Table 1 shows the average CaP detection sensitivity and specificity over 18 studies obtained from C-LLE (m=4), ICA based peak detection, PCA and z-score. Table 1(b) shows the sensitivity and specificity results averaged across 18 datasets for C-LLE (m=3, 4, 5) compared to LLE by varying the number of dimensions. The effectiveness of the scheme for detection of prostate cancer is evident from the quantitative results (Table 1), with sensitivity and specificity of close to 87% and 85%, respectively, compared with the current state-of-the-art methods (peak detection, PCA and z-score). Table 1(b) further shows that C-LLE has a higher sensitivity and specificity than traditional LLE across all tested dimensions (m=3, 4, 5), which suggests the efficacy of the scheme.
In many CAD systems, segmentation of the object of interest is a necessary first step, yet segmenting the prostate from in vivo MR images is a particularly difficult task. The prostate is especially difficult to see in in vivo imagery because of poor tissue contrast on account of MRI related artifacts such as background inhomogeneity. While the identification of the prostate boundary is critical for calculating the prostate volume, for creating patient specific prostate anatomical models, and for CAD systems, accurately identifying the prostate boundaries on an MR image is a tedious task, and manual segmentation is not only laborious but also very subjective.
Previous work on automatic or semi-automatic prostate segmentation has primarily focused on TRUS images. Only a few prostate segmentation attempts for MR images currently exist. One suggested approach to segmenting prostate MR images is based on nonrigid registration, in which a series of training ‘atlas’ images is registered to the test image, and the set of atlas images that best match the test image (based on mutual information) is chosen. The selected atlas images are then averaged to achieve a segmentation of the prostate in the test image. A 3D method for segmenting the prostate and bladder simultaneously was suggested to account for inter-patient variability in prostate appearance. A major limitation of previous prostate segmentation attempts is that the prostate varies widely between patients in size, shape, and texture.
A popular segmentation method is the Active Shape Model (ASM), a statistical scheme first developed in the mid-1990s. ASMs use a series of manually landmarked training images to generate a point distribution model, and principal component analysis is performed on this point distribution model to generate a statistical shape model. Then, the Mahalanobis distance (MD) between possible gray values and the mean gray values (determined from the training images) is minimized to identify the boundary of an object. The MD is a statistical measurement used to determine similarity between sets, and is valid as long as the gray values of the training data have a normal distribution. At this point, the shape model is deformed to best fit the boundary points, and the process is repeated until convergence. While ASMs have been used for different applications, prostate segmentation with ASMs has yielded highly variable results, with overlap coefficients ranging from about 0.15 to about 0.85. One of the main shortcomings of the ASM is that it requires careful initialization. This is usually done by manual intervention, which can be tedious and subject to operator variability. Another method is to start at a very low resolution of the image, overlay the starting shape on it, and then increase the resolution, performing the segmentation at each resolution. This approach is not guaranteed to work. Also, since the object of interest could be anywhere in the image, very computationally expensive searches could be required for initialization, contributing to a slow overall convergence time. One such search used a shape-variant Hough transform to initialize the segmentation. A highly promising segmentation scheme which searches the entire image was suggested, yet it was stated that properly initialized regions of interest (ROIs), such as those described in the examples provided herein, would greatly improve the efficiency of that algorithm.
Another limitation of ASMs lies with the inherent limitations in using the MD to find the object of interest. Normally, the point with the minimum MD between its surrounding gray values and the mean gray values is assumed to lie on the border of the object. To compute the MD, a covariance matrix of the training gray values is constructed during the training phase, and the MD calculation uses the inverse of that covariance matrix. However, if the covariance matrix is singular (rank-deficient), then the inverse matrix will be undefined, and consequently the MD will be undefined. If the number of pixels sampled is greater than the number of training images, the covariance matrix will become singular. For example, at least 225 training images would be required to sample a 15×15 window of gray values. Therefore, either having limited training data or attempting to sample a large number of pixels would prove problematic. Also, the MD assumes that the texture training data has an underlying Gaussian distribution, which is not always guaranteed.
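The covariance problem described above can be illustrated numerically. The sketch below uses synthetic Gaussian gray values (an assumption made purely for illustration) and shows that with fewer training images than sampled pixels, the sample covariance matrix cannot have full rank and therefore has no inverse.

```python
import numpy as np

def covariance_rank(n_images, n_pixels, seed=0):
    """Rank of the sample covariance of n_images synthetic gray-value windows."""
    rng = np.random.default_rng(seed)
    samples = rng.normal(size=(n_images, n_pixels))   # one row per training image
    cov = np.cov(samples, rowvar=False)               # n_pixels x n_pixels
    return cov.shape, int(np.linalg.matrix_rank(cov))

# A 15x15 window gives 225 gray values; only 50 training images are available.
shape, rank = covariance_rank(n_images=50, n_pixels=15 * 15)
```

Because the mean is subtracted when the covariance is estimated, the rank here is bounded by the number of training images minus one, well below the 225 needed for the matrix to be invertible.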
This novel, fully automated prostate segmentation scheme integrates spectral clustering and ASMs, and consists of an algorithm comprising 2 distinct stages: spectral clustering of MRS data, followed by an ASM scheme. For the first stage, non-linear dimensionality reduction is performed, followed by hierarchical clustering on MRS data to obtain a rectangular Region of Interest (ROI), which will serve as the initialization point for an ASM.
Several non-linear dimensionality reduction techniques were explored, and graph embedding was decided upon to transform the multi-dimensional MRS data into a lower-dimensional space. Graph embedding is a non-linear dimensionality reduction technique in which the relationship between adjacent objects in the higher dimensional space is preserved in the co-embedded lower dimensional space. By clustering of metavoxels in this lower-dimensional space, non-informative spectra can be eliminated. This dimensionality reduction and clustering is repeated hierarchically to yield a bounding box encompassing the prostate. In the second stage, the prostate bounding box obtained from the spectral clustering scheme serves as an initialization for the ASM, in which the mean shape is transformed to fit inside this bounding box. Nearby points are then searched to find the prostate border, and the shape is updated accordingly. The afore-mentioned limitations of the MD led to the use of mutual information (MI) to calculate the location of the prostate border. Given two images, or regions of gray values T1 and T2, the MI between T1 and T2 is an indication of how well the gray values can predict one another. It is normally used for registration and alignment tasks, but in this Example, MI is used to search for the prostate boundary. For each training image, a window of intensity values surrounding each manually landmarked point on the prostate boundary is taken. Then those intensity values are averaged to calculate the mean ‘expected’ intensity values. The idea behind the method is that if a pixel lies on the prostate border, then the MI between its surrounding gray values and the mean intensity values of the border will be maximum. The advantages of using MI over MD are the following: (1) the number of points sampled is not dependent on the number of training images, and (2) MI does not require an underlying Gaussian distribution, as long as the gray values are predictive of one another.
Finally, once a set of pixels presumably located on the prostate border are determined (henceforth referred to as ‘goal points’), the shape is updated to best fit these goal points. A weighting scheme is introduced for fitting the goal points. The goal points are weighted using two values. The first value is the normalized MI value. MI is normalized by the Entropy Correlation Coefficient (ECC), which rescales the MI values to be between 0 and 1. The second weighting value is how well the shape fit each goal point during the previous iteration, which is scaled from 0 to 1. This is termed the ‘outlier weight,’ where if the shape model couldn't deform close to a goal point, it is given a value close to 0, and if the shape model was able to deform close to a goal point, it is given a value close to 1. These two terms are multiplied together to obtain the final weighting factor for each landmark point. It's important to note that as in traditional ASM schemes, the off-line training phase needs to only be done once, while the on-line segmentation is fully automated.
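The two-term weighting described above may be sketched as follows. The ECC is computed here with the definition commonly used in image registration, 2·I(a, b)/(H(a)+H(b)), estimated from a joint histogram; the bin count and the function names are assumptions, and the outlier weights are taken as given inputs.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (bits) of a probability array, ignoring empty bins."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def ecc(a, b, bins=8):
    """Entropy Correlation Coefficient: 2*I(a, b) / (H(a) + H(b)), in [0, 1]."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p_ab = joint / joint.sum()
    h_a = entropy(p_ab.sum(axis=1))          # marginal entropy of a
    h_b = entropy(p_ab.sum(axis=0))          # marginal entropy of b
    mi = h_a + h_b - entropy(p_ab.ravel())   # mutual information
    return 2.0 * mi / (h_a + h_b) if (h_a + h_b) > 0 else 0.0

def landmark_weights(ecc_values, outlier_weights):
    """Final weight per goal point: normalized MI times outlier weight."""
    return np.asarray(ecc_values) * np.asarray(outlier_weights)

a = np.arange(64.0)
weights = landmark_weights([0.8, 0.5], [1.0, 0.2])
```

In this sketch a goal point with high normalized MI but a poor fit in the previous iteration (outlier weight near 0) contributes little to the shape update.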
The primary contributions and novel aspects of this work are:
Finally, an exhaustive evaluation of the model via randomized cross validation is performed to assess segmentation accuracy against expert segmentations. In addition, model parameter sensitivity and segmentation efficiency are also assessed.
System Overview
Notation
A spectral scene Ĉ=(Ĉ, {circumflex over (ƒ)}) is defined, where Ĉ is a 3D grid of metavoxels. For each spatial location ĉεĈ there is an associated 256-dimensional valued spectral vector {circumflex over (F)}(ĉ)=[{circumflex over (ƒ)}j(ĉ)|jε{1, . . . , 256}], where {circumflex over (ƒ)}j(ĉ) represents the concentration of different biochemicals (such as creatine, citrate, and choline). The associated MR intensity scene C=(C, ƒ) is defined, where C represents a set of spatial locations, and ƒ(c) represents a function that returns the intensity value at any spatial location cεC. It is important to note that the distance between any two adjacent metavoxels ĉ, {circumflex over (d)}εĈ, ∥ĉ−{circumflex over (d)}∥ (where ∥·∥ denotes the L2 norm), is roughly 16 times the distance between any two adjacent spatial voxels c, dεC. A κ-neighborhood centered on cεC is defined as Nκ(c) where for ∀dεNκ(c), ∥d−c∥≦κ, c∉Nκ(c). In addition, the set of pixel intensities for the κ-neighborhood centered on c is defined as Fκ(c)=[ƒ(d)|dεNκ(c)]. Finally, for any set C, |C| is defined as the cardinality of C.
Data Description
The spectral datasets (consisting of both MRI and MRS data) used for the study were collected during the ACRIN multi-site trial (http://www.acrin.org/6659_protocol.html). All the MRS and MRI studies were 1.5 Tesla. The MRI studies were axial T2 images obtained from 19 patients. The 19 3D studies comprised a total of 148 image sections. These sections correspond to either the base, midgland, or apex of the prostate. Three distinct 2D ASM models were constructed, one each for the base, midgland, and apex. The ground truth was determined by manual outlining of the prostate border on each section by an expert radiologist.
Brief Outline of Methods
MRS Methodology
Non-Linear Dimensionality Reduction Via Graph Embedding
The spectra {circumflex over (F)}(ĉ), for ĉεĈ, lie in a 256 dimensional space. Hence, the goal is to find an embedding vector G(ĉ) for ∀ĉεĈ, and its associated class ω (informative or non-informative), such that the distances between elements of ω are well preserved in the lower dimensional space. Hence if metavoxels ĉ, {circumflex over (d)}εĈ both belong to class ω, then ∥G(ĉ)−G({circumflex over (d)})∥ should be small. To compute the optimal embedding, first a matrix Wεℝ^(|Ĉ|×|Ĉ|) is defined, representing the similarity between all objects in Ĉ. For ∀ĉ, {circumflex over (d)}εĈ, W is defined as
W(L(ĉ),L({circumflex over (d)}))=exp(−∥{circumflex over (F)}(ĉ)−{circumflex over (F)}({circumflex over (d)})∥) (55.11.1)
where L(ĉ) represents a unique index position of ĉ=(x, y, z) derived from its x, y, z coordinates in 3D space. The embedding vector G is obtained from the maximization of the function
where γ=|Ĉ|−1. In addition, D is a diagonal matrix where for ∀ĉεĈ, the diagonal element (L(ĉ), L(ĉ)) is defined as D(L(ĉ), L(ĉ))=Σ{circumflex over (d)}εĈW(L(ĉ), L({circumflex over (d)})). The embedding space is defined by the Eigenvectors corresponding to the smallest β Eigenvalues of (D−W)G=λDG. A matrix Mεℝ^(|Ĉ|×β) of the first β Eigenvectors is constructed, and for ∀ĉεĈ, G(ĉ) is defined as row L(ĉ) of M. G(ĉ) is therefore a vector consisting of element number L(ĉ) from each of the first β Eigenvectors, which represents the β-dimensional Cartesian coordinates. The graph embedding algorithm is summarized below.
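The eigenvalue problem (D−W)G=λDG can be sketched with NumPy as follows. As an implementation choice not stated in the text, the generalized problem is solved through the equivalent symmetric problem D^(−1/2)(D−W)D^(−1/2)v=λv with G=D^(−1/2)v; the function name is illustrative.

```python
import numpy as np

def graph_embedding(F, beta):
    """Embed the rows of F into beta dimensions via (D - W) g = lambda D g."""
    F = np.asarray(F, dtype=float)
    dist = np.linalg.norm(F[:, None, :] - F[None, :, :], axis=-1)
    W = np.exp(-dist)                      # similarity matrix
    d = W.sum(axis=1)                      # diagonal entries of D (all > 0)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # D^{-1/2} (D - W) D^{-1/2} = I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L_sym)     # eigenvalues in ascending order
    G = d_inv_sqrt[:, None] * vecs[:, :beta]   # row i is the embedding of object i
    return vals[:beta], G

# Two small groups of synthetic 2-D "spectra".
F = [[0, 0], [0.1, 0], [0, 0.1], [3, 3], [3.1, 3], [3, 3.1]]
vals, G = graph_embedding(F, beta=2)
```

The eigenvector belonging to the smallest eigenvalue (λ≈0) is constant across objects, so the separation between the two groups in this toy example appears in the second coordinate.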
Hierarchical Cascade to Prune Non-Informative Spectra
At each iteration h, a subset of metavoxels Ĉh⊂Ĉ is obtained by eliminating the non-informative spectra. The metavoxels ĉεĈh are aggregated into clusters Vh1, Vh2, Vh3 by applying k-means clustering to all ĉεĈh in the low dimensional embedding Gh(ĉ). Initially, most of the locations ĉεĈh correspond to zero-padded or non-informative spectra. Therefore, while the unsupervised clustering results in 3 clusters, the dominant cluster (the cluster with the largest number of elements) is non-informative and is eliminated.
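One level of the pruning cascade can be sketched as below. The k-means routine is a plain Lloyd iteration with a deterministic farthest-point start, chosen here only to keep the example reproducible; the text does not specify the initialization. Re-embedding the surviving metavoxels before the next level is left to the caller.

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Plain Lloyd's k-means with a deterministic farthest-point start."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        labels = np.linalg.norm(X[:, None] - centers[None], axis=-1).argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def prune_dominant(X, keep_idx, k=3):
    """Cluster the kept objects and drop the largest (dominant) cluster."""
    labels = kmeans(X[keep_idx], k)
    dominant = np.bincount(labels, minlength=k).argmax()
    return keep_idx[labels != dominant]

# Ten "non-informative" points plus two small informative groups.
X = np.array([[0, 0]] * 10 + [[5, 5]] * 3 + [[10, 0]] * 3, dtype=float)
kept = prune_dominant(X, np.arange(len(X)))
```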
ASM Methodology
Training the Shape Model and Expected Gray Values
Step 1: The training phase begins by using training images of oblique axial slices and manually selecting M landmarks on the prostate boundary. The set of K training images is denoted as St={Cα|αε{1, . . . , K}}. The apex and the 2 posterolateral-most points on the prostate are landmarked, and the remaining (M−3) landmarks are equally spaced between these landmarks. Therefore, each training image CαεSt has a corresponding set of landmarks Xα⊂Cα, where Xα={cmα|mε{1, . . . , M}}, and where cmα=(xmα, ymα) denotes the coordinates of the mth landmark point in Cα.
Step 2: These shapes are aligned using a modified Procrustes analysis.
Then, principal component analysis is performed, so that any valid prostate shape X can be represented as
X=T(X̄+P·b),
where X̄ is the mean of the aligned training shapes, T is an affine transformation mapping, b is the variable controlling the shape, and P is a matrix with each column representing an Eigenvector. Out of the M Eigenvectors, only the first z are needed to explain most (98%) of the training variation, where if λr is the rth Eigenvalue, z is as small as possible such that (Σr=1zλr)≧(0.98·Σr=1Mλr). Therefore, |b|=z, and P is a matrix consisting of only the first z Eigenvectors. In these experiments, valid prostate shapes are represented as a vector b of length 18, with each element in b constrained between ±2.5 standard deviations, with one standard deviation for the rth element in b given as √{square root over (λr)}.
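The mode-selection rule and the ±2.5 standard deviation constraint can be sketched as follows; the function names are illustrative and the eigenvalues in the example are synthetic.

```python
import math

def num_modes(eigenvalues, fraction=0.98):
    """Smallest z such that the first z eigenvalues hold >= fraction of the total."""
    lam = sorted(eigenvalues, reverse=True)
    total, running = sum(lam), 0.0
    for z, value in enumerate(lam, start=1):
        running += value
        if running >= fraction * total:
            return z
    return len(lam)

def clamp_shape_params(b, eigenvalues, limit=2.5):
    """Constrain each b_r to [-limit*sqrt(lambda_r), +limit*sqrt(lambda_r)]."""
    lam = sorted(eigenvalues, reverse=True)
    clamped = []
    for b_r, lam_r in zip(b, lam):
        bound = limit * math.sqrt(lam_r)
        clamped.append(max(-bound, min(bound, b_r)))
    return clamped

z = num_modes([90.0, 9.0, 0.5, 0.5])
b = clamp_shape_params([10.0, -3.0], [4.0, 1.0])
```

Clamping b in this way keeps every generated shape within the range of variation seen in the training set.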
Step 3: The mean gray values must now be calculated for each landmark point, which will represent the expected gray values for the prostate border. First, the κ-neighborhood centered on each cmα as Nκ(cmα) is taken, where αε{1, . . . K}. Then, for ∀duαεNκ(cmα), the function
is defined, where uε{1, . . . , |Nκ(cm)|}. Finally, the expected gray values are defined as
Initializing the ASM and Using Mutual Information to Find the Prostate Border
At this point, a new image in which to search for the prostate is defined as the scene C=(C, ƒ), where C∉St. The very first step of the system is to initialize the landmarks for the first iteration, X1 (where X1⊂C). To do this, the smallest rectangle containing
A set of points presumed to lie on the prostate border must now be calculated. For the nth iteration, the current landmark points cmn, mε{1, . . . , M}, are denoted by the set Xn, and the set of points {tilde over (c)}mn, mε{1, . . . , M}, presumed to lie on the prostate border is denoted by the set {tilde over (X)}n. The pixels surrounding each cmn, denoted as Nv(cmn), are all possible locations for the prostate border. For ∀cjεNv(cmn), a set of nearby intensity values Fκ(cj) is compared with using mutual information (MI). One common way to calculate a normalized MI value is by calculating the Entropy Correlation Coefficient (ECC) between 2 sets of values. If a and b are 2 sets of intensity values, the normalized MI value between them is denoted as (a, b). The location cj with the highest mutual information (MI) value between Fκ(cj) and
{tilde over (c)}mn=argmaxc
and cjεNv(cmn).
Updating the Shape
Xn must now be deformed to {tilde over (X)}n. At this point a weighting scheme is introduced for updating the shape. The vector of weights is denoted as Γn={Γmn|mε{1, . . . , M}}, where each is a weight associated with the landmark point cmn. First, the MI values for each goal point {tilde over (c)}mnε{tilde over (X)}n are compared to one another. The goal point with the highest MI value between Fκ({tilde over (c)}mn) and
where dx={dxm|mε{1, . . . , M}} and dxm=∥
Xn+1=T(
The system then repeats searching for new goal points and updating the shape, until the mean Euclidean distance between Xn and Xn−1 is less than 0.2 pixels, at which time convergence is assumed.
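The stopping rule can be expressed as a short check. The 0.2 pixel tolerance is taken from the text; the function names and the landmark coordinates in the example are illustrative.

```python
import math

def mean_landmark_shift(X_prev, X_curr):
    """Mean Euclidean distance between corresponding landmarks."""
    return sum(math.dist(p, q) for p, q in zip(X_prev, X_curr)) / len(X_curr)

def has_converged(X_prev, X_curr, tol=0.2):
    """True once the mean landmark movement drops below tol pixels."""
    return mean_landmark_shift(X_prev, X_curr) < tol

X_prev = [(0.0, 0.0), (10.0, 0.0)]
still_moving = has_converged(X_prev, [(0.3, 0.4), (10.0, 0.0)])   # mean shift 0.25
settled = has_converged(X_prev, [(0.1, 0.0), (10.0, 0.1)])        # mean shift 0.10
```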
Evaluation Methods
Once the final landmarks have been determined, this shape is compared to a manually outlined expert segmentation. The area based metrics used to evaluate the segmentation system are positive predictive value (PPV), specificity, sensitivity, and global overlap coefficient, with values closer to 1 indicating better results. The edge based metrics used to evaluate the system are mean absolute distance (MAD) and Hausdorff distance, which are given in units of pixels with values closer to 0 indicating better results. To evaluate the system, the optimal parameters are first computed, which is followed by a randomized cross validation. To perform the cross validation, the variable q represents the number of images to test with, and P represents the total number of images in a certain group (either apex, midgland, or base). For each group, q images are randomly selected for testing, and an ASM model is generated with the remaining (P−q) images. Averaging the metric results for those q images gives a set of mean values (μ) for sensitivity, specificity, overlap, PPV, MAD, and Hausdorff distance. This is repeated 50 times and the mean metric values (
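The area based metrics can be sketched from pixel-wise comparisons of the automated and expert segmentations. The overlap coefficient is assumed here to be intersection over union, since the text does not give its formula, and the masks are assumed to be flattened lists of 0/1 values of equal length.

```python
def area_metrics(seg, truth):
    """Area based metrics from two flattened binary masks of equal length."""
    pairs = list(zip(seg, truth))
    tp = sum(1 for s, t in pairs if s and t)
    fp = sum(1 for s, t in pairs if s and not t)
    fn = sum(1 for s, t in pairs if not s and t)
    tn = sum(1 for s, t in pairs if not s and not t)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "overlap": tp / (tp + fp + fn),   # assumed: intersection over union
    }

seg = [1, 1, 1, 0, 0, 0, 0, 0]
truth = [0, 1, 1, 1, 0, 0, 0, 0]
metrics = area_metrics(seg, truth)
```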
Parameter Selection and Sensitivity Analysis
To evaluate the system, the optimal parameter values must be computed. First, the search area (Nv) was set as a 15×3 rectangle, centered on the current landmark point and rotated to face in the normal direction of the shape at that point, as shown in
Once the optimal parameters were determined, the importance of a correct initialization was evaluated by randomly changing the initialization bounding box. Random ROIs were selected for initializing the ASM system, and these were compared to the ROIs obtained from the MRS data.
Results and Discussion
Qualitative Results
Several qualitative results for the MRS initialization are shown in
Quantitative Results
Table 2 shows the results from the randomized cross validation. A confidence interval for the mean values was also calculated, which is derived from the Student's t-distribution and the standard error. So for any of the metric values, after repeating 50 times, the 99% confidence interval is given as
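A confidence interval of this kind is conventionally the mean plus or minus a Student's t critical value times the standard error. The sketch below assumes that conventional form with the sample standard deviation; the critical value 2.680 is the tabulated two-sided 99% value for 49 degrees of freedom and therefore applies only to 50 repetitions. The metric values in the example are synthetic.

```python
import math
import statistics

# Two-sided 99% Student's t critical value for 49 degrees of freedom (table value).
T_CRIT_99_DF49 = 2.680

def confidence_interval(values, t_crit=T_CRIT_99_DF49):
    """Return (low, high) = mean -/+ t_crit * (sample std / sqrt(n))."""
    n = len(values)
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(n)   # standard error of the mean
    return mean - t_crit * sem, mean + t_crit * sem

# 50 hypothetical repetitions of one metric.
values = [0.8] * 25 + [0.9] * 25
low, high = confidence_interval(values)
```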
The Active Shape Model (ASM) and Active Appearance Model (AAM) are both popular methods for segmenting known anatomical structures. The ASM algorithm involves an expert initially selecting landmarks to construct a statistical shape model using Principal Component Analysis (PCA). A set of intensity values is then sampled along the normal in each training image. During segmentation, any potential pixel on the border also has a profile of intensity values sampled. The point with the minimum Mahalanobis distance between the mean training intensities and the sampled intensities presumably lies on the object border. Finally, the shape model is updated to fit these landmark points, and the process repeats until convergence. However, there are several limitations with traditional ASMs with regard to image segmentation. (1) ASMs require an accurate initialization and final segmentation results are sensitive to the user defined initialization. (2) The border detection requires that the distribution of intensity values in the training data is Gaussian, which need not necessarily be the case. (3) Limited training data could result in a near-singular covariance matrix, causing the Mahalanobis distance to not be defined.
Alternatives and extensions to the traditional ASM algorithm have been proposed. In an alternative classifier-based method, Taylor-series gradient features are calculated, and the features that improve classification accuracy during training are used during segmentation. Then, the classifier is used on the features of the test image to determine border landmark points. The classifier approach provides an alternative to the Mahalanobis distance for finding landmark points, but requires an offline feature selection stage. One segmentation algorithm implemented a multi-attribute based approach and also allowed for multiple landmark points to be incorporated; however, it still relies on the Mahalanobis distance for its cost function, which may not be optimal under certain pertinent circumstances.
MANTRA differs from the traditional AAM in that AAMs employ a global texture model of the entire object, which is combined with the shape information to create a general appearance model. For several medical image tasks however, local texture near the object boundary is more relevant to obtaining an accurate segmentation instead of global object texture, and MANTRA's approach is to create a local texture model for each individual landmark point.
MANTRA comprises a new border detection methodology, from which a statistical shape model can be fitted.
(a) Local Texture Model Reconstruction: To overcome the limitations associated with using the Mahalanobis distance, MANTRA performs PCA on pixel neighborhoods surrounding the object borders of the training images to create a local texture model for each landmark point. Any potential border landmark point of the test image has a neighborhood of pixels sampled, and the PCA-based local texture model is used to reconstruct the sampled neighborhood in a manner similar to AAMs. These training reconstructions are compared to the original pixel values to detect the object border, where the location with the best reconstruction is presumably the object border.
(b) Use of Multiple Attributes with Combined Mutual Information: Since mutual information (MI), a metric that quantifies the statistical interdependence of multiple random variables, operates without assuming any functional relationship between the variables, it is used as a robust image similarity measure to compare the reconstructions to the original pixel values. In order to overcome the limitations of using image intensities to represent the object border, 1st and 2nd order statistical features are generated from each training image. These features have been shown to be useful in both computer aided diagnosis systems and registration tasks. To integrate multiple image attributes, a Combined MI (CMI) is used, because of its property to incorporate non-redundant information from multiple sources, and its previous success in complementing similarity measures with information from multiple feature calculations. Since CMI operates in higher dimensions, histogram-based estimation approaches would become too sparse when more than 2 features are used. Therefore, the k nearest neighbor (kNN) entropic graph technique is used to estimate the CMI. The values are plotted in a high dimensional graph, and the entropy is estimated from the distances to the k nearest neighbors, which is subsequently used to estimate the MI value.
(c) Non-requirement of Model Initialization: Similarly to several other segmentation schemes, MANTRA is cast within a multi-resolution framework, in which the shape is updated in an iterative fashion and across image resolutions. At each resolution increase, the area of the search neighborhood decreases, allowing only fine adjustments to be made in the higher resolution. This overcomes the problem of noise near the object boundary and makes MANTRA robust to different initializations.
The experiments were performed on nearly 230 images comprising 3 MR protocols and 2 body regions. Three different 2D models were tested: MANTRA, the traditional ASM, and ASM+MI (a hybrid with aspects of both MANTRA and ASM). Quantitative evaluation was performed against expert delineated ground truth via 6 metrics.
Brief Overview of MANTRA
MANTRA comprises distinct training and segmentation steps (
Training
Then, a neighborhood surrounding each landmark point is sampled from each of the K feature scenes for all N training images.
The set of N training images is defined as Str={Cα|αε{1 . . . N}}, where Cα=(C, ƒα) is an image scene where C⊂ℝ^2 represents a set of 2D spatial locations and ƒα(c) represents a function that returns the intensity value at any cεC. For ∀CαεStr, Xα⊂C is a set of M landmark points manually delineated by an expert, where Xα={cmα|mε{1, . . . M}}. For ∀CαεStr, K feature scenes α,k=(C, ƒα,k), kε{1, . . . , K} are then generated. For implementation in the methods disclosed herein, the gradient magnitude, Haralick inverse difference moment, and Haralick entropy texture features are used. For each training image Cα, and each landmark point cmα, a κ-neighborhood Nκ(cmα) (where for ∀dεNκ(cmα), ∥d−cmα∥2≦κ, cmαεNκ(cmα)) is sampled on each feature scene α,k and normalized. For each landmark point m and each feature k, the normalized feature values for ∀dεNκ(cmα) are denoted as the vector gmα,k=[ƒα,k(d)/Σdƒα,k(d)|dεNκ(cmα)]. The mean vector for each landmark point m and each feature k is given as
and the covariance matrix of gmα,k over ∀αε{1 . . . N} is denoted as φmk. Then, PCA is performed by calculating the Eigenvectors of φmk and retaining the Eigenvectors that account for most (˜98%) of the variation in the training data, denoted as Φmk.
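The neighborhood sampling and normalization described above can be sketched as follows. This is a minimal illustration only: the function names `sample_neighborhood` and `mean_vector` are hypothetical, the feature scene is assumed to be a 2D list indexed as [row][column], and the PCA step is not shown.

```python
def sample_neighborhood(feature, center, kappa):
    """Return the normalized feature values within Euclidean distance kappa of center.

    feature: 2D list of feature values (assumed layout: feature[row][col]).
    center:  (row, col) of the landmark point.
    """
    rows, cols = len(feature), len(feature[0])
    cy, cx = center
    values = []
    for y in range(max(0, cy - kappa), min(rows, cy + kappa + 1)):
        for x in range(max(0, cx - kappa), min(cols, cx + kappa + 1)):
            if (y - cy) ** 2 + (x - cx) ** 2 <= kappa ** 2:
                values.append(feature[y][x])
    total = sum(values)
    # Divide by the sum so the vector sums to 1; left unscaled if the sum is 0.
    return [v / total for v in values] if total else values


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors (one per training image)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


# Toy usage: a constant 5x5 feature scene and a landmark at its center.
scene = [[1.0] * 5 for _ in range(5)]
g = sample_neighborhood(scene, (2, 2), kappa=1)
```

In practice one such vector would be computed per landmark, per feature, and per training image before averaging.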
Reconstructing Local Image Texture
We define a test image as the scene C_te, where C_te ∉ S_tr, and its corresponding K feature scenes as ℱ^k, k ∈ {1, . . . , K}. The M landmark points for the current iteration j are denoted as the set X_te = {c_m | m ∈ {1, . . . , M}}. A γ-neighborhood N_γ (where γ ≠ κ) is searched near each current landmark point c_m to identify a landmark point c̃_m which is in close proximity to the object border. For j = 1, c_m denotes the initialized landmark point, and for j ≠ 1, c_m denotes the result of deforming to c̃_m from iteration (j − 1) using the statistical shape model. For every e ∈ N_γ(c_m), a κ-neighborhood N_κ(e) is sampled on each feature scene ℱ^k and normalized, denoted as the vector g_e^k = [ƒ^k(d) / Σ_d ƒ^k(d) | d ∈ N_κ(e)]. Then, for each e (which is a potential location for c̃_m), the K vectors g_e^k, k ∈ {1, . . . , K}, are reconstructed from the training PCA models, where the vector of reconstructed pixel values for feature k is given as
\[ R_e^k = \bar{g}_m^k + \Phi_m^k \left(\Phi_m^k\right)^{T} \left(g_e^k - \bar{g}_m^k\right) \]
where \bar{g}_m^k is the training mean vector and \Phi_m^k is the matrix of retained Eigenvectors for landmark point m and feature k.
Identifying New Landmarks in 3 Models: ASM, ASM+MI, and MANTRA
We wish to compare three different methods for finding new landmark points. The first is the traditional ASM method, which minimizes the Mahalanobis distance. The remaining 2 methods utilize the Combined Mutual Information (CMI) metric to find landmark points. The MI between 2 vectors is a measure of how predictive they are of each other, based on their entropies. CMI is an extension of MI, where 2 sets of vectors can be compared intelligently by taking into account the redundancy between the sets. For 2 sets of vectors {A_1, . . . , A_n} and {B_1, . . . , B_n}, where each A and B is a vector of the same dimensionality, the CMI between them is given as I(A_1 . . . A_n, B_1 . . . B_n) = H(A_1 . . . A_n) + H(B_1 . . . B_n) − H(A_1 . . . A_n, B_1 . . . B_n), where H denotes the joint entropy. To estimate this joint entropy, k-nearest-neighbor (kNN) entropic graphs are used, where H is estimated from the average kNN distance.
The data consisted of 128 1.5 Tesla (T), T2-weighted in vivo prostate MR slices, 21 3T T1-weighted DCE in vivo prostate MR slices, and 78 1.5T T1-weighted DCE MR breast images. To evaluate the methods, a 10-fold cross validation was performed on each of the datasets for the MANTRA, ASM+MI, and ASM methods, in which 90% of the images were used for training, and 10% were used for testing, which was repeated until all images had been tested.
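The 10-fold cross validation protocol described above can be sketched as follows. This is a minimal illustration, not the splitting code actually used: the function name `k_fold_splits`, the fixed random seed, and the round-robin fold assignment are assumptions.

```python
import random


def k_fold_splits(n_items, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs so every item is tested exactly once."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


# Example with the 128 T2-weighted prostate slices mentioned in the text.
splits = list(k_fold_splits(128))
```

Each of the 10 rounds trains on roughly 90% of the images and tests on the remaining 10%.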
Quantitative Results
For nearly 230 clinical images, MANTRA, ASM, and ASM+MI were compared against expert delineated segmentations (Expert 1) in terms of 6 error metrics, where PPV and MAD stand for Positive Predictive Value and Mean Absolute Distance, respectively. The segmentations of an experienced radiologist (Expert 1) were used as the gold standard for evaluation. Also shown in Table 1 is the segmentation performance of a radiology resident (Expert 2) compared to Expert 1. Note that MANTRA performs comparably to Expert 2, and in 78% of the 18 scenarios (6 metrics, 3 tests), MANTRA performs better than ASM and ASM+MI. The scenarios in which it failed (specificity and PPV of the prostate) did not take into account false negative area. The ASM+MI algorithm performed better than the ASM method but worse than the MANTRA method, suggesting that MI is a more effective metric than the Mahalanobis distance for border detection, while also justifying the use of the reconstructions in MANTRA. In addition, using statistical texture features improved the performance of all methods, showing the effectiveness of the multi-attribute approach. For the breast segmentation task, all 3 methods performed equivalently, indicating that the new method is as robust as the traditional ASM method in segmenting a variety of medical images.
Qualitative Results
Many segmentation algorithms are derivatives of the popular Active Shape Model (ASM) segmentation system, which was previously (see e.g. Examples 1-3) used to segment prostate MR images. ASM systems start with an initial shape defined by landmarks and then find new landmarks with which to update the shape. The new landmarks are found by minimizing the Mahalanobis distance metric between the testing and training intensity values. However, this is by no means a solved problem, as many times the point with the minimum Mahalanobis distance is not actually the object border. This can be caused by non-Gaussian distributions of intensity values, or by artifacts such as bias field (in MR images) and intensity inhomogeneities. Recently, segmentation algorithms have begun to incorporate multiple statistical texture features to overcome problems with using intensities. One example is ASM with Optimal Features (ASMOF), in which the new landmark points are identified by classifying pixels as inside or outside the object border, and finding features which most accurately perform this classification. This bypasses the requirement of using the Mahalanobis distance completely. Another system is “Minimal shape and intensity cost path segmentation”, which combines the Mahalanobis distance of multiple features and finds the set of landmarks that best match the expected shape of the object. Here, feature ensembles for different regions of the object boundary are selected based on the Adaboost feature selection algorithm. This allows each section of the object boundary to have its own unique strong classifier. Adaboost intelligently selects features based on the types of errors they make and combines several weak classifiers to create a unified strong classifier. For example, if 2 features do not do well individually, but complement each other well by making different errors, they will be chosen by Adaboost.
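To make the boosting idea concrete, the following is a minimal sketch of discrete AdaBoost over single-feature threshold classifiers (decision stumps). It is a generic textbook formulation, not the implementation used for the landmark ensembles; the function names, the stump form, and the toy data are assumptions made for illustration.

```python
import math


def train_adaboost(X, y, rounds=3):
    """X: list of feature vectors; y: labels in {-1, +1}.

    Returns a list of weak classifiers as (alpha, feature, threshold, polarity).
    """
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        best = None
        # Exhaustively pick the stump with the lowest weighted error.
        for f in range(len(X[0])):
            for thr in sorted({x[f] for x in X}):
                for pol in (1, -1):
                    pred = [pol if x[f] >= thr else -pol for x in X]
                    err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                    if best is None or err < best[0]:
                        best = (err, f, thr, pol, pred)
        err, f, thr, pol, pred = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, f, thr, pol))
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * p) for wi, yi, p in zip(w, y, pred)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * (pol if x[f] >= thr else -pol) for a, f, thr, pol in ensemble)
    return 1 if score >= 0 else -1


# Toy data: the positive class requires BOTH features to be high, so no single
# stump is error-free, but the boosted combination is.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [-1, -1, -1, 1]
model = train_adaboost(X, y, rounds=3)
predictions = [predict(model, x) for x in X]
```

On this toy set each individual feature misclassifies one sample, while the weighted combination separates all four, which mirrors the complementary-error behavior described above.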
Methodology:
There is a distinct offline training phase and an online testing phase, which are outlined below.
1. Training (Offline):
(a) Feature Extraction:
(b) Building Boosted Ensembles for Each Landmark:
2. Testing (Online)
(a) Feature Extraction
(b) Use Previously Obtained Boosted Ensembles to Detect Landmarks
(c) Update Cartesian Coordinates to Determine New Landmark Point
Fifteen pixels were searched in either direction from the ground truth landmark, and a “new” landmark was found for one iteration. Ideally, this new landmark would be exactly the location of the ground truth landmark.
The use of computer aided diagnosis (CAD) schemes for early detection and classification of CaP has recently been explored. A novel supervised CAD scheme for detection of CaP from 4 T ex vivo prostate MRI was presented (See e.g. Examples herein). A weighted feature ensemble scheme was used to integrate multiple 3D texture features to generate a likelihood scene in which the intensity at every spatial location corresponded to the probability of cancer being present. Improvements to the method via use of non-linear dimensionality reduction (graph embedding) and multiple classifier systems are reported herein. A multimodal statistical classifier was also suggested, which integrated texture features from multi-protocol 1.5 T in vivo MRI to generate a statistical probability map representing likelihoods of cancer for different regions within the prostate. Area under the Receiver-Operating Characteristic (ROC) curve (AUC) was used to estimate the classifier accuracy. A maximum AUC of 0.839 was reported.
In this example a novel unsupervised scheme is presented to segment different regions within 1.5 T endorectal in vivo MR prostate imagery. First, corrections are made for MR related artifacts, namely bias field inhomogeneity and intensity non-standardness. This is followed by extraction of over 350 3D texture features at every spatial location within the MRI image. These features have been previously shown to be able to differentiate cancerous and non-cancerous regions. To avoid the curse of dimensionality associated with high dimensional feature spaces, the textural data at every spatial location is projected non-linearly into a lower dimensional space where the objects (MR voxels in this case) can be clustered into distinct classes. Due to inherent non-linearities in biomedical data, linear dimensionality reduction schemes such as Principal Component Analysis (PCA) have been shown to perform poorly compared to non-linear dimensionality reduction (NLDR) schemes such as Locally Linear Embedding (LLE) and Graph Embedding (GE) in unraveling object clusters while preserving class relationships.
While the group has had success in the use of NLDR for the automated classification of prostate magnetic resonance spectroscopy (MRS) data as well as protein and gene expression data, methods such as LLE and GE are sensitive to the choice of parameters. NLDR schemes attempt to preserve geodesic distances between objects from the high- to the low-dimensional spaces, unlike PCA, which preserves Euclidean distances. Methods such as LLE estimate object distances by assuming that within a small local neighborhood objects are linearly related. The geodesic estimate is thus a function of the size of the neighborhood within which local linearity is assumed. These NLDR schemes are also sensitive to the high dimensional feature space within which geodesic distances are computed, since relative object adjacencies may change from one feature space to another. As an example, consider a feature vector F_1(u) associated with each object u in a set C. Let the lower dimensional co-ordinates of three objects u, v, w ∈ C, based on F_1, be given by X_1(u), X_1(v), X_1(w), where X_1 is the principal Eigenvector obtained via application of NLDR to F_1. Assume that of the 3 objects, u and v belong to class ω_1 while w belongs to class ω_2. Assuming that the data has been properly projected into the lower dimensional space, it should follow that ∥X_1(u) − X_1(v)∥₂ < ∥X_1(u) − X_1(w)∥₂, where ∥·∥₂ represents the Euclidean norm. Note that the above is true only if F_1 accurately encapsulates the class related information regarding u, v, w ∈ C. However, this may not hold for another feature set F_2, which on account of noisy or missing attributes may result in low dimensional projections X_2 such that ∥X_2(u) − X_2(v)∥₂ > ∥X_2(u) − X_2(w)∥₂. In order to represent the true relationship between u, v, w, the adjacency between objects in these lower dimensional embedding spaces is then represented as a function of the distance between the objects along the lower-dimensional manifold.
In this Example a scheme is disclosed, wherein multiple such representations are combined to obtain a stable embedding representing the true class relationship between objects in high dimensional space. Analogous to classifier ensemble schemes for creating strong stable classifiers by combining multiple weak unstable classifiers with large bias and variance, the consensus embedding scheme will yield a more stable data embedding by reducing the variance in the individual embedding spaces. This is done by computing a consensus distance matrix W which reflects the averaged relative object adjacencies between u, v, wεC in multiple low dimensional data projections. Multidimensional scaling (MDS) is applied to W to obtain the final stable data embedding. Consensus clustering is then applied to segregate objects into distinct categories in an unsupervised fashion.
Experimental Design
Data Description and Notation
A total of 18 1.5 T in vivo endorectal MRI and MRS datasets were collected from the American College of Radiology Imaging Network (ACRIN) multi-site prostate trial. For each patient, MRI data (T2 imaging protocol) was acquired prior to radical prostatectomy. Following resection, the gland was quartered and stained. These sections were then manually examined for CaP to constitute the ground truth on histology. These regions were then manually mapped onto the MRI images to estimate CaP presence and extent.
We define a 3D MRI scene C = (C, ƒ), where C is a set of spatial locations c_i ∈ C, i ∈ {1, . . . , |C|}, |C| is the cardinality of any set C, and ƒ(c) is a function that assigns an intensity value to every c ∈ C. This 3D image at MRS metavoxel resolution is defined as Ĉ = (Ĉ, ƒ̂), where Ĉ is a 3D grid of metavoxels at locations ĉ_î ∈ Ĉ, î ∈ {1, . . . , |Ĉ|}. It is important to note that the distance between any two adjacent metavoxels ĉ_î, ĉ_ĵ ∈ Ĉ, ∥ĉ_î − ĉ_ĵ∥₂ (where ∥·∥₂ denotes the L2 norm), is roughly 16 times the distance between any two adjacent voxels c_i, c_j ∈ C. Accordingly, ƒ̂(ĉ_î) is defined for every ĉ_î ∈ Ĉ.
Determination of Approximate Ground Truth for CaP on MRI
Partial ground truth for the 1.5 T MR datasets in the ACRIN database is available in the form of approximate sextant locations and sizes of cancer for each dataset as described previously. In the previous Examples, an algorithm for the registration of ex vivo MRI and whole-mount histological (WMH) images were developed for accurate mapping of the spatial extent of CaP from WMH sections onto MRI. However most of the histology data in the ACRIN study are not WMH, but small sections of the gland which makes it difficult for them to be reconstituted into WMH sections. Hence the CaP ground truth estimate on the MRI sections is obtained in the following manner. The MR image of the prostate is visually divided into two lateral compartments: Left (L) and Right (R); and further divided into 3 regions longitudinally: Base (B), Midgland (M) and Apex (A). Presence of CaP (potential cancer space) has been previously determined in one or more of these six regions: Left Base (LB), Left Midgland (LM), Left Apex (LA), Right Base (RB), Right Midgland (RM) and Right Apex (RA) via manual mapping of CaP from histology onto the corresponding MRI sections. The maximum diameter of the tumor is also recorded in each of the 6 candidate locations and is denoted as MaxDiameter. The total number of possible cancer voxels cεC at the MR voxel resolution within the cancer space is given as:
where ┌┐ refers to the ceiling operation and Δx, Δy refer to the dimensions of the MR voxel c in the X and Y dimensions. Similarly, the number of possible cancer metavoxels ĉεĈ at the MRS metavoxel resolution was calculated as:
where Δ{circumflex over (x)}, Δŷ refer to the dimensions of the MRS metavoxel ĉ in the X and Y dimensions. Note that the exact spatial location of CaP voxels on a particular slice is not available, only the size and sextant within which it occurs. This potential cancer space nonetheless serves as a basis to perform a semi-quantitative evaluation of the CAD scheme.
Correcting Bias Field and Non-Linear MR Image Intensity Artifacts
Image intensity variations due to radio frequency (RF) field inhomogeneities may be caused by a number of different factors, including poor RF field uniformity, static field inhomogeneity, and RF penetration. The ITK toolkit's BiasCorrector algorithm was used to correct the original 3D MR scenes for bias field inhomogeneity, and an interactive version of the previously presented image intensity standardization algorithm was used to correct for non-linearity of image intensities.
Image intensity standardization is a post-MR acquisition processing operation designed for correcting inter-acquisition signal intensity variations (non-standardness). Such grayscale intensities do not have a fixed tissue-specific meaning within the same imaging protocol, the same body region, or even within the same patient. When the histograms for different prostate studies (in different colors) are plotted together (as seen in
Note that before intensity standardization, as well as for the result of the linear technique, the intensity histograms are misaligned.
Feature Extraction
Over 350 3D texture feature scenes, corresponding to three different texture classes, were extracted from each MRI scene. These feature representations were chosen since they have been demonstrated to be able to discriminate between the cancer and non-cancer classes. The feature scenes ℱ_u = (C, ƒ_u) for each C are calculated by applying the feature operators Φ_u, u ∈ {1, . . . , 373}, within a local neighborhood associated with every c ∈ C. Hence ƒ_u(c) is the feature value associated with feature operator Φ_u at voxel c. Therefore, it is possible to define a feature vector associated with each c ∈ C as F(c) = [ƒ_u(c) | u ∈ {1, . . . , 373}]. A κ-neighborhood centered on c_i ∈ C is defined as N_κ(c_i), where for every c_j ∈ N_κ(c_i), ∥c_j − c_i∥ ≤ κ, i, j ∈ {1, . . . , |C|}, c_i ∉ N_κ(c_i). Similarly, a κ-neighborhood N_κ(ĉ_î) for ĉ_î ∈ Ĉ is defined, where for every c_j ∈ N_κ(ĉ_î), ∥c_j − ĉ_î∥ ≤ κ, î ∈ {1, . . . , |Ĉ|}, j ∈ {1, . . . , |C|}, c_j ≠ ĉ_î. Based on the definition provided herein for a metavoxel, it is possible to similarly define a feature attribute for each metavoxel ĉ_î ∈ Ĉ as the median ƒ̂_u(ĉ_î) = MEDIAN_{c_j ∈ N_κ(ĉ_î)} [ƒ_u(c_j)].
Gradient Features
Gradient features are calculated using steerable and non-steerable linear gradient operators. Eleven non-steerable gradient features were obtained using Sobel, Kirsch and standard derivative operations. Gabor gradient operators comprising the steerable class of gradient calculations were defined for every cεC where c=(x, y, z),
where ω is the frequency of a sinusoidal plane wave along the X-axis, and σ_X, σ_Y, and σ_Z are the space constraints of the Gaussian envelope along the X, Y, and Z directions, respectively. The orientation of the filter, θ, is affected by the coordinate transformations x_1 = r(x cos θ + y sin θ), y_1 = r(−x sin θ + y cos θ), and z_1 = r(z), where r is the scaling factor. These were computed within the sliding window neighborhood N_κ. Gabor gradient features were calculated at 13 scales, 6 orientations, and 3 window sizes (κ ∈ {3, 5, 7}).
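As an illustration of the steerable operator, the sketch below evaluates one common form of a 3D Gabor function: a Gaussian envelope modulated by a cosine along the rotated X-axis, using the coordinate transformations given above. The exact normalization constant and the functional form used in the original filter bank are not recoverable from the text, so the unnormalized form here, the function names, and the default parameter values are assumptions.

```python
import math


def gabor_value(x, y, z, omega, theta, r=1.0, sigma=(1.0, 1.0, 1.0)):
    """Unnormalized 3D Gabor response at offset (x, y, z) from the window center."""
    # Rotate in the XY plane by theta and scale by r, as in the text.
    x1 = r * (x * math.cos(theta) + y * math.sin(theta))
    y1 = r * (-x * math.sin(theta) + y * math.cos(theta))
    z1 = r * z
    sx, sy, sz = sigma
    envelope = math.exp(-0.5 * ((x1 / sx) ** 2 + (y1 / sy) ** 2 + (z1 / sz) ** 2))
    return envelope * math.cos(2.0 * math.pi * omega * x1)


def gabor_kernel(kappa, omega, theta, r=1.0, sigma=(1.0, 1.0, 1.0)):
    """Build a kappa x kappa x kappa kernel (kappa odd), indexed as [z][y][x]."""
    h = kappa // 2
    return [
        [
            [gabor_value(x, y, z, omega, theta, r, sigma) for x in range(-h, h + 1)]
            for y in range(-h, h + 1)
        ]
        for z in range(-h, h + 1)
    ]


# One kernel for window size 3; a filter bank would loop over scales and orientations.
kernel = gabor_kernel(3, omega=0.25, theta=0.0)
```

A feature scene would then be obtained by sliding such a kernel over the volume, once per combination of scale, orientation, and window size.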
First Order Statistical Features
Four first order statistical features for 3 different window sizes were calculated. They included the mean, median, standard deviation, and range for the gray values of voxels within the sliding window neighborhood Nκ, κε{3, 5, 7}.
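A minimal sketch of the four first order statistics for the gray values inside one sliding window follows. The function name `first_order_features` is hypothetical, and the use of the population (rather than sample) standard deviation is an assumption, since the text does not say which was used.

```python
import statistics


def first_order_features(window_values):
    """Mean, median, standard deviation, and range of the gray values in one window."""
    return {
        "mean": statistics.mean(window_values),
        "median": statistics.median(window_values),
        "std": statistics.pstdev(window_values),  # population standard deviation (assumed)
        "range": max(window_values) - min(window_values),
    }


features = first_order_features([1, 2, 3, 4, 10])
```

Repeating this at every voxel for κ ∈ {3, 5, 7} would give the 12 first order feature scenes described above.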
Second Order Statistical Features
To calculate the second order statistical (Haralick) feature scenes, a G×G co-occurrence matrix P_{d,c,κ} associated with N_κ(c) is calculated, where G is the maximum gray scale intensity in C. The value at any location [g_1, g_2] in P_{d,c,κ}, where g_1, g_2 ∈ {1, . . . , G}, represents the frequency with which two distinct voxels c_i, c_j ∈ N_κ(c), i, j ∈ {1, . . . , |C|}, with associated image intensities ƒ(c_i) = g_1, ƒ(c_j) = g_2 are separated by distance d. A total of 13 Haralick features, including energy, entropy, inertia, contrast, correlation, sum average, sum variance, sum entropy, difference average, difference variance, difference entropy, local homogeneity, and average deviation, were extracted at every voxel c ∈ C, based on P_{d,c,κ}, for κ ∈ {3, 5, 7}, d = 1, and G ∈ {64, 128, 256}.
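The co-occurrence computation can be sketched on a small 2D window as follows. This is a simplified illustration: gray levels are 0-based here (the text uses 1 to G), only horizontal and vertical offsets at distance d are counted, pairs are counted symmetrically, and only two of the 13 Haralick features (energy and entropy) are shown. All of these choices, and the function names, are assumptions.

```python
import math


def cooccurrence(window, levels, d=1):
    """Symmetric gray-level co-occurrence counts for a 2D window of ints in [0, levels)."""
    P = [[0] * levels for _ in range(levels)]
    rows, cols = len(window), len(window[0])
    for y in range(rows):
        for x in range(cols):
            for dy, dx in ((0, d), (d, 0)):  # right and down neighbors at distance d
                yy, xx = y + dy, x + dx
                if yy < rows and xx < cols:
                    g1, g2 = window[y][x], window[yy][xx]
                    P[g1][g2] += 1
                    P[g2][g1] += 1
    return P


def energy_and_entropy(P):
    """Haralick energy and entropy (base 2) from the normalized co-occurrence matrix."""
    total = sum(map(sum, P))
    probs = [v / total for row in P for v in row if v]
    energy = sum(p * p for p in probs)
    entropy = -sum(p * math.log2(p) for p in probs)
    return energy, entropy


P = cooccurrence([[0, 0], [1, 1]], levels=2)
energy, entropy = energy_and_entropy(P)
```

For this 2×2 window all four matrix entries are equally likely, so the energy is at its minimum and the entropy at its maximum for a 2-level image.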
Non Linear Dimensionality Reduction
Graph Embedding
The aim of graph embedding is to find an embedding vector X_GE(c), ∀c ∈ C, such that the relative ordering of the distances between objects in the high dimensional feature space is preserved in lower dimensional space. Thus, if c_i, c_j ∈ C, i, j ∈ {1, . . . , |C|}, are close in high dimensional feature space, then ∥X_GE(c_i) − X_GE(c_j)∥₂ should be small, where ∥·∥₂ represents the Euclidean norm. This will only be true if the distances between all c_i, c_j ∈ C are preserved in the low dimensional mapping of the data. To compute the optimal embedding, first a matrix W_GE ∈ ℝ^{|C|×|C|} is defined, representing the adjacency between all objects c ∈ C in high-dimensional feature space. For all c_i, c_j ∈ C, W_GE is defined as
\[ W_{GE}(i,j) = e^{-\lVert F(c_i) - F(c_j) \rVert_2} \]
X_GE(c) is then obtained from the maximization of the function:
\[ E_{W_{GE}}(X_{GE}) = 2\gamma \times \mathrm{tr}\left[ \frac{X_{GE}\,(D - W_{GE})\,X_{GE}^{T}}{X_{GE}\,D\,X_{GE}^{T}} \right] \]
where tr is the trace operator, X_GE = [X_GE(c_1), X_GE(c_2), . . . , X_GE(c_n)], n = |C|, and γ = n − 1. Additionally, D is a diagonal matrix where for all c_i ∈ C, the diagonal element is defined as D(i, i) = Σ_j W_GE(i, j). The embedding space is defined by the Eigenvectors corresponding to the smallest β Eigenvalues of (D − W_GE)X_GE = λDX_GE. The matrix X_GE ∈ ℝ^{|C|×β} of the first β Eigenvectors is constructed, and ∀c_i ∈ C, X_GE(c_i) is defined as row i of X_GE. X_GE(c_i) is therefore a vector consisting of element number i from each of the first β Eigenvectors, which represents the β-dimensional Cartesian coordinates.
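The construction of the adjacency and degree matrices can be sketched as follows, assuming an exponential kernel of the form exp(−∥F(c_i) − F(c_j)∥₂) for the adjacency. The function names are hypothetical, and the generalized Eigenvalue problem itself is not solved here; in practice it would be handed to a numerical linear algebra library.

```python
import math


def adjacency_matrix(features):
    """W[i][j] = exp(-||F(c_i) - F(c_j)||_2) for every pair of feature vectors."""
    n = len(features)
    return [
        [math.exp(-math.dist(features[i], features[j])) for j in range(n)]
        for i in range(n)
    ]


def degree_matrix(W):
    """Diagonal matrix D with D[i][i] equal to the sum of row i of W."""
    n = len(W)
    return [[sum(W[i]) if i == j else 0.0 for j in range(n)] for i in range(n)]


# Three toy objects; the first and third have identical feature vectors.
W = adjacency_matrix([[0.0, 0.0], [3.0, 4.0], [0.0, 0.0]])
D = degree_matrix(W)
```

Objects with identical features receive the maximum adjacency of 1, and the adjacency decays exponentially with feature-space distance.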
Locally Linear Embedding (LLE)
LLE operates by assuming that objects in a neighborhood of a feature space are locally linear. Consider the set of feature vectors ℱ = {F(c_1), F(c_2), . . . , F(c_n)}, n = |C|. It is desired to map the set ℱ to the set χ = {X_LLE(c_1), X_LLE(c_2), . . . , X_LLE(c_n)} of embedding co-ordinates. For all objects c ∈ C, LLE maps the feature vector F(c) to the embedding vector X_LLE(c). Let {cn
Having determined the weighting matrix W_LLE, the next step is to find a low-dimensional representation of the points in ℱ that preserves this weighting. Thus, for each F(c_i) approximated as the weighted combination of its kNN, its projection X_LLE(c_i) will be the weighted combination of the projections of these same kNN. The optimal χ_LLE in the least squares sense minimizes
\[ S(\chi_{LLE}) = \mathrm{tr}\left( \chi_{LLE}\, L\, \chi_{LLE}^{T} \right) \]
where tr is the trace operator, χ_LLE = [X_LLE(c_1), X_LLE(c_2), . . . , X_LLE(c_n)], L = (I − W_LLE)(I − W_LLE^T), and I is the identity matrix. The minimization of this cost subject to the constraint χ_LLE χ_LLE^T = I (a normalization constraint that prevents the solution χ_LLE ≡ 0) is an Eigenvalue problem whose solutions are the Eigenvectors of the Laplacian matrix L. Since the rank of L is n − 1, the first Eigenvector is ignored and the Eigenvector with the second smallest Eigenvalue represents the best one-dimensional projection of all the samples. The best two-dimensional projection is given by the Eigenvectors with the second and third smallest Eigenvalues, and so forth.
Consensus Embedding to Obtain Stable Low Dimensional Data Representation
We require a lower dimensional embedding that models the true nature of the underlying manifold that is described in high dimensional space. Varying the feature subspaces of the high dimensional manifold and the parameters (e.g. the number of k nearest neighbors in LLE) associated with NLDR methods yields multiple embeddings which individually model relationships between objects. Disclosed herein is a novel method to obtain this representation by generating multiple lower dimensional embeddings of feature subspaces and capturing the adjacencies between the voxels in the lower dimensional spaces. These adjacencies can then be combined to yield a more stable representative embedding. Multiple embeddings X_{φ,α}(c) for c ∈ C are generated, based on feature subspaces F_α(c) ⊆ F(c), α ∈ {1, . . . , B}, using the NLDR schemes φ ∈ {GE, LLE} described earlier. Each embedding X_{φ,α} will hence represent adjacencies between voxels c_i, c_j ∈ C based on the feature sub-space F_α. Thus, ∥X_{φ,α}(c_i) − X_{φ,α}(c_j)∥₂ will vary as a function of F_α. To represent the true adjacency and class relationship between c_i, c_j ∈ C there is a need to combine the multiple embeddings X_{φ,α}. A confusion matrix W_{φ,α} ∈ ℝ^{|C|×|C|} representing the adjacency between any two voxels c_i, c_j ∈ C in the lower dimensional embedding representation X_{φ,α} is first calculated as:
\[ W_{\phi,\alpha}(i,j) = \lVert X_{\phi,\alpha}(c_i) - X_{\phi,\alpha}(c_j) \rVert_2 \]
where c_i, c_j ∈ C, i, j ∈ {1, . . . , |C|}, φ ∈ {GE, LLE}, α ∈ {1, . . . , B}. The confusion matrix W_{φ,α} will hence represent the relationships between the voxels in each of the B embedding spaces X_{φ,α}, obtained via F_α, α ∈ {1, . . . , B}. These voxel adjacencies can be averaged as
\[ \tilde{W}_{\phi}(i,j) = \frac{1}{B} \sum_{\alpha=1}^{B} W_{\phi,\alpha}(i,j), \quad \forall i,j \in \{1, \ldots, |C|\} \]
where W̃_φ(i, j) represents the average distance in the reduced dimensional space over B feature sets F_α between the voxels c_i, c_j ∈ C. The idea is that not every F_α will represent the true class relationship between c_i, c_j ∈ C; hence W̃_φ(i, j) is a more reliable estimate of the true embedding distance between c_i, c_j. Multidimensional scaling (MDS) is then applied to this W̃_φ to achieve the final combined embedding X̃_φ. MDS is implemented as a linear method that preserves the Euclidean geometry between each pair of voxels c_i, c_j ∈ C. This is done by finding optimal positions for the data points c_i, c_j in lower-dimensional space through minimization of the least squares error in the input pairwise Euclidean distances in W̃_φ. The complete algorithm for the consensus embedding scheme is described below:
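The listing of the complete algorithm is not reproduced here. As an illustration of the averaging step only, the following minimal sketch computes the pairwise distance matrix of each embedding and averages them over the B embeddings. The function names are hypothetical, each embedding is assumed to be a list of coordinate vectors in the same object order, and the final MDS step is not shown.

```python
import math


def pairwise_distances(embedding):
    """Euclidean distance between every pair of embedded objects."""
    return [[math.dist(a, b) for b in embedding] for a in embedding]


def consensus_distance(embeddings):
    """Average the pairwise distance matrices of B embeddings of the same n objects."""
    B = len(embeddings)
    matrices = [pairwise_distances(e) for e in embeddings]
    n = len(matrices[0])
    return [
        [sum(m[i][j] for m in matrices) / B for j in range(n)]
        for i in range(n)
    ]


# Two toy 1D embeddings (B = 2) of the same three objects.
embeddings = [
    [[0.0], [1.0], [4.0]],
    [[0.0], [3.0], [2.0]],
]
W_avg = consensus_distance(embeddings)
```

The averaged matrix would then be passed to an MDS routine to obtain the combined embedding.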
Consensus Clustering on the Consensus Embedding Space
To overcome the instability associated with centroid based clustering algorithms, multiple weak clusterings V_{φ,t}^1, V_{φ,t}^2, V_{φ,t}^3, t ∈ {0, . . . , T}, are generated by repeated application of k-means clustering on the combined low dimensional manifold X̃_φ(c), for all c ∈ C and φ ∈ {GE, LLE}. Each cluster V_{φ,t} is a set of objects which has been assigned the same class label by the k-means clustering algorithm. As the number of elements in each cluster tends to change for each such iteration of k-means, a co-association matrix H_φ was calculated with the underlying assumption that voxels belonging to a natural cluster are very likely to be co-located in the same cluster for each iteration. Co-occurrences of pairs of voxels c_i, c_j in the same cluster V_{φ,t} are hence taken as votes for their association. H_φ(i, j) thus represents the number of times c_i, c_j ∈ C were found in the same cluster over T iterations. If H_φ(i, j) = T then there is a high probability that c_i, c_j do indeed belong to the same cluster. MDS is applied to H_φ followed by a final unsupervised classification using k-means, to obtain the final stable clusters Ṽ_φ^1, Ṽ_φ^2, Ṽ_φ^3. The algorithm is described below:
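The algorithm listing is not reproduced here. As an illustration of the voting step only, the sketch below builds a co-association matrix from the label assignments of several clustering runs. The function name `co_association` and the toy labels are assumptions; the k-means runs and the subsequent MDS step are not shown.

```python
def co_association(labelings):
    """H[i][j] = number of clustering runs in which objects i and j share a cluster.

    labelings: list of T label lists, each of length n (one label per object).
    """
    n = len(labelings[0])
    H = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    H[i][j] += 1
    return H


# Three toy clustering runs over four objects; label values are arbitrary per run.
runs = [
    [0, 0, 1, 1],
    [1, 1, 0, 2],
    [0, 1, 1, 1],
]
H = co_association(runs)
```

Because only label equality within a run is compared, the matrix is unaffected by the arbitrary numbering of clusters from one k-means run to the next.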
Results and Discussion
Qualitative Results
Our scheme was applied to 18 1.5 T datasets. The CAD analysis was done at both voxel and metavoxel resolutions (C and Ĉ).
1.5 T MR Data at Metavoxel Resolution
1.5 T MR Data at Voxel Resolution
Quantitative Evaluation
1.5 T MR Data at Metavoxel Resolution
We have already determined the counts of metavoxels lying in the potential cancer space for each dataset. The counts of the metavoxels in each of the clusters V̂_φ^1, V̂_φ^2, V̂_φ^3, φ ∈ {GE, LLE}, are compared with this ground truth value. Sensitivity, specificity and positive predictive values (PPV) for each of V̂_φ^1, V̂_φ^2, V̂_φ^3 are obtained. The cluster with the highest sensitivity, specificity and PPV is then identified as the cancer class. These results are then averaged over 18 datasets and are summarized in Table 1. The results suggest that GE yields a marginally better sensitivity compared to LLE, but LLE has higher specificity and PPV. Note that with both GE and LLE the average detection sensitivity and specificity are 80% or higher.
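For reference, the three measures follow the standard definitions sketched below. This assumes that counts of true positives, false positives, true negatives, and false negatives are available for a cluster; the semi-quantitative, count-based comparison against the potential cancer space described above may derive these counts differently, and the example numbers are invented for illustration.

```python
def detection_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, PPV) from confusion counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, specificity, ppv


# Hypothetical counts for one candidate cluster.
sens, spec, ppv = detection_metrics(tp=8, fp=4, tn=86, fn=2)
```

Computing the triple for each of the three clusters and selecting the best one corresponds to the cancer-class assignment described in the text.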
1.5 T MR Data at Voxel Resolution
Similar to the manner used to evaluate the metavoxel resolution results, the counts of voxels lying in the potential cancer space were already determined for each dataset. The counts of the voxels in each of the clusters Ṽ_φ^1, Ṽ_φ^2, Ṽ_φ^3, φ ∈ {GE}, are compared with this ground truth value. Sensitivity, specificity and positive predictive values (PPV) for each of Ṽ_φ^1, Ṽ_φ^2, Ṽ_φ^3 are obtained. The cluster with the highest sensitivity, specificity and PPV is then identified as the cancer class. These results are then averaged over 12 datasets and are summarized in Table 1. Note that due to the much higher spatial resolution of the data analyzed, the performance measures at voxel resolution are higher compared to those at metavoxel resolution.
Recently, researchers have been developing computer-aided diagnosis (CAD) methods for CaP detection on MRI and MRS alone. In previous Examples, a novel supervised CAD scheme for detection of CaP from 4 T ex vivo prostate MRI was disclosed. A multi-attribute classifier trained via CaP extent on MRI determined from corresponding whole mount histology specimens was used to generate a likelihood scene in which the intensity at every spatial location corresponded to the probability of CaP being present. A multichannel statistical classifier was also suggested, which used multi-protocol 1.5 T in vivo MRI to generate a statistical probability map of CaP occurrence within the prostate. The quantification of prostate MRS by model based time fitting has also been investigated, as has frequency domain analysis. However, automated peak finding algorithms suffer from problems associated with noisy data, which worsen when a large baseline is present along with a low signal to noise ratio. z-score analysis (the ratio of the difference between the population mean and the individual score to the population standard deviation) was suggested as an automated technique for quantitative assessment of 3D MRSI data for glioma. Disclosed herein above is an automated scheme for detection of CaP using in vivo prostate MRS data alone, which applied hierarchical clustering and manifold learning to successfully differentiate between cancerous and non-cancerous regions in the prostate.
The Example described herein represents the first attempt to integrate MRS and MRI prostate data in a quantitative manner for CaP detection. Previous related work has focused on the qualitative examination of the modalities individually and combining the results of such findings to arrive at a consensual decision about the severity and extent of CaP. Quantitative integration of multimodal image data (PET and MRI) is simply the concatenation of image intensities following registration. By comparison, quantitative integration of heterogeneous modalities such as MRS and MRI involves combining physically disparate sources such as image intensities and spectra, and is therefore a challenging problem. As an example, consider a multimodal MR scene C, where for every location c, spectral and image information exists. Let S(c) denote the spectral MRS feature vector at every location c and let ƒ(c) denote the associated intensity value for this location on the corresponding MRI image. Building a meta-classifier for CaP detection by concatenating S(c) and ƒ(c) together at location c is not a solution owing to (i) the different dimensionalities of S(c) and ƒ(c) and (ii) the difference in physical meaning of S(c) and ƒ(c). To improve discriminability between CaP and benign regions in prostate MRI, a large number of texture features are extracted from the MRI image. This feature space will form a texture feature vector F(c) at every location c. The spectral vector S(c) and texture feature vector F(c) will each form a high dimensional feature space at every location c in a given multimodal scene C.
By means of non-linear dimensionality reduction (NLDR), the heterogeneity of these differing sets of information is overcome, as well as inherent non-linearities within this data. Such non-linearities were shown not to be taken into account by linear dimensionality reduction schemes such as principal component analysis (PCA). Use is made of graph embedding (GE) and locally linear embedding (LLE) to project the individual high dimensional MRI and spectral features into a lower dimensional embedding space, while preserving class and object relationships from these high dimensional feature spaces in their resultant lower dimensional embedding spaces. On account of the aforementioned reasons, physically combining S(c) and F(c) in the high dimensional space is not feasible; the goal is to integrate S(c) and F(c) in a lower dimensional embedding space where S(c) and F(c) are now represented by their corresponding embedding vectors XS(c) and XF(c). Note that since XS(c) and XF(c) represent embedding co-ordinates which are divorced from any physical meaning and thus they can be reconciled into an integrated feature vector XF,S(c) and be used to build a MRI, MRS meta-classifier for CaP detection. In order to analyze the result of integrating multimodal MR data as compared to using the individual modalities, unsupervised consensus k-means clustering is used to partition all the spatial locations cεC into cancerous, benign, or other tissue classes based on the corresponding XF(c), XS(c), XF,S(c) values. This scheme was applied for CaP detection on a total of 16 in vivo prostate MRI and MRS datasets. Ground truth estimates of CaP on partial whole mount histological sections were available which were used to define CaP location on MRI which was then used for quantitative evaluation. 
The hypothesis is that the MRS, MRI meta-classifier XF,S(c) will provide higher CaP detection sensitivity and specificity than classifiers based on the individual modalities (XF(c) and XS(c)).
System Overview
Data Description and Notation
A total of 16 1.5 T in vivo endorectal MRI and MRS studies were obtained from the American College of Radiology Imaging Network (ACRIN) multi-site prostate trial. For each patient, MR data (T2 imaging protocol) was acquired prior to radical prostatectomy. Following resection, the gland was quartered and stained. These sections were then manually examined for CaP to constitute the ground truth on histology. CaP locations on MRI were then manually obtained.
We define a 3D MRI scene C=(C, ƒ) where C is a set of spatial locations ci ∈ C, i ∈ {1, . . . , |C|}, |C| is the cardinality of any set C and ƒ(c) is a function that assigns an intensity value to every c ∈ C. A 3D spectral scene is defined as Ĉ=(Ĉ, Ĝ), where Ĉ is a 3D grid of metavoxels superposed on C. For every spatial location ĉî ∈ Ĉ, î ∈ {1, . . . , |Ĉ|}, there is an associated 256-dimensional spectral vector Ĝ(ĉî)=[ĝu(ĉî) | u ∈ {1, . . . , 256}], where ĝu(ĉî) represents the concentration of different biochemicals (such as creatine, citrate, and choline).
Determination of Approximate Ground Truth for CaP on MRI, MRS
Partial ground truth for the CaP extent on MR studies in the ACRIN database is available in the form of approximate sextant locations and sizes for each study. An algorithm was previously disclosed (See e.g. PCT/US2008/0418, incorporated herein by reference in its entirety), for registration of ex vivo MRI and whole-mount histological (WMH) images for accurate mapping of spatial extent of CaP from WMH sections onto MRI. However most of the histology data in the ACRIN study are not WMH, but partial gland sections which are difficult to reconstitute. Hence the CaP ground truth estimate on the MRI sections is obtained in the following manner. The MR image of the prostate is visually divided into two lateral compartments: Left (L) and Right (R); and further divided into 3 regions longitudinally: Base (B), Midgland (M) and Apex (A). Presence of CaP (potential cancer space) has previously been determined in one or more of these six locations: Left Base (LB), Left Midgland (LM), Left Apex (LA), Right Base (RB), Right Midgland (RM), and Right Apex (RA) via manual mapping of CaP from histology onto the corresponding MRI sections. The maximum diameter of the tumor is also recorded in each of the 6 candidate locations and is denoted as MaxDiameter. The number of possible cancer metavoxels is calculated as:
where Δx̂ and Δŷ refer to the dimensions of the metavoxel ĉ ∈ Ĉ in the X and Y dimensions. Note that the exact spatial location of these metavoxels on a particular slice is not available, only the size and sextant within which it occurs. This potential cancer space nonetheless serves as a basis for performing a semi-quantitative evaluation of the CAD scheme.
Brief Outline of Experimental Design
As illustrated in
Module 1: MRS Feature Extraction. The high dimensional spectral feature vector Ĝ(ĉ) is non-linearly projected into lower dimensional space via NLDR methods to form the low dimensional embedding of MRS data Ỹ(ĉ). The MRS embedding space forms one input to the data integration module.
Module 2: MRI Feature Extraction. Texture features previously shown to be able to differentiate between cancerous and non-cancerous regions are used to extract a large texture feature space F̂(ĉ) at every location ĉ. The consensus embedding method is used to project the data F̂(ĉ) into embedding space. This novel scheme combines embeddings of multiple feature subspaces of the MRI feature data to achieve a more stable embedding solution. The consensus MRI embedding space X̃(ĉ) forms the second input to the data integration module.
Module 3: Integration of MRI and MRS. A consolidated embedding vector Ẽ(ĉ) for each ĉ ∈ Ĉ is created by concatenating embedding coordinates Ỹ(ĉ) and X̃(ĉ).
Module 4: Classification of integrated MRS and MRI feature spaces via consensus clustering. Unsupervised consensus clustering is applied to partition all objects ĉ ∈ Ĉ into one of 3 classes. The clustering scheme is applied individually to X̃(ĉ), Ỹ(ĉ), and Ẽ(ĉ). The cluster with the largest overlap (sensitivity and specificity) with respect to the potential cancer space is identified as the CaP class.
Feature Extraction Methods for MRS and MRI
MRS Feature Extraction
Most automated MRS classification schemes are based on calculating ratios of the discriminatory peaks (choline/creatine) in MRS, which involves peak detection and noise suppression. Application of such methods to prostate MRS is complicated by poor signal quality and noise. Instead, the MR spectrum is considered in its totality. The 256-point MRS spectrum Ĝ(ĉ) associated with every metavoxel ĉ ∈ Ĉ is non-linearly projected into a lower dimensional space using two NLDR methods, φ ∈ {GE, LLE}, to calculate the MRS embedding vector Ỹφ(ĉ) associated with every ĉ ∈ Ĉ. Ỹφ(ĉ) now represents the MRS feature vector at ĉ ∈ Ĉ.
Graph Embedding
The aim of graph embedding is to find an embedding vector YGE(ĉ), ∀ĉ ∈ Ĉ, such that the relative ordering of the distances between objects in the high dimensional feature space is preserved in lower dimensional space. Thus, if locations ĉî, ĉĵ ∈ Ĉ, î, ĵ ∈ {1, . . . , |Ĉ|}, are close in high dimensional feature space, then ∥YGE(ĉî)−YGE(ĉĵ)∥2 should be small, where ∥·∥2 represents the Euclidean norm. However, this is only true if the distances between all ĉî, ĉĵ ∈ Ĉ are preserved in the low dimensional mapping of the data. To compute the optimal embedding, first a matrix WGE ∈ ℝ^(|Ĉ|×|Ĉ|) is defined, representing the adjacency between all objects ĉ ∈ Ĉ in high-dimensional feature space. For all ĉî, ĉĵ ∈ Ĉ, WGE is defined as
WGE(î,ĵ)=e^(−∥F(ĉî)−F(ĉĵ)∥2).
YGE(ĉ) is then obtained from the maximization of the function:
where tr is the trace operator, γGE=[YGE(ĉ1), YGE(ĉ2), . . . , YGE(ĉn)], n=|Ĉ| and γ=n−1. Additionally, D is a diagonal matrix where for all ĉî ∈ Ĉ, the diagonal element is defined as D(î,î)=ΣĵWGE(î, ĵ). The embedding space is defined by the Eigenvectors corresponding to the smallest β Eigenvalues of (D−WGE)γGE=λDγGE. The matrix γGE ∈ ℝ^(|Ĉ|×β) of the first β Eigenvectors is constructed, and ∀ĉî ∈ Ĉ, YGE(ĉî) is defined as row î of γGE. YGE(ĉî) is therefore a vector consisting of element number î from each of the first β Eigenvectors and represents the β-dimensional embedding coordinates.
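The graph embedding steps described above can be sketched in code. The following is a minimal illustration only, not the implementation used in the Example. It assumes NumPy is available; the function name `graph_embedding` and its parameters are hypothetical; the adjacency is taken as the exponential of the negative Euclidean distance; and the trivial constant Eigenvector (Eigenvalue near zero) is skipped, which is an assumption not stated in the text.

```python
import numpy as np

def graph_embedding(features, beta=3):
    """Sketch: embed the rows of `features` into `beta` dimensions."""
    F = np.asarray(features, dtype=float)
    # Pairwise Euclidean distances between all feature vectors.
    dist = np.linalg.norm(F[:, None, :] - F[None, :, :], axis=2)
    # Adjacency matrix W(i, j) = exp(-||F_i - F_j||_2).
    W = np.exp(-dist)
    d = W.sum(axis=1)  # diagonal entries of D
    # Solve (D - W) y = lambda * D * y through the equivalent symmetric
    # problem D^(-1/2) (D - W) D^(-1/2) v = lambda * v, with y = D^(-1/2) v.
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = (np.diag(d) - W) * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    Y = eigvecs * d_inv_sqrt[:, None]
    # Assumption: drop the first (constant) eigenvector, keep the next beta.
    return eigvals[1:beta + 1], Y[:, 1:beta + 1]
```

Row i of the returned matrix plays the role of the β-dimensional embedding coordinates of object i. Solving the symmetric form avoids a dependency on a generalized Eigenvalue solver.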
Locally Linear Embedding (LLE)
LLE operates by assuming that objects in a neighborhood of a feature space are locally linear. Consider the set of feature vectors ℱ={F(ĉ1), F(ĉ2), . . . , F(ĉn)}, n=|Ĉ|. It is desired to map the set ℱ to the set γ={YLLE(ĉ1), YLLE(ĉ2), . . . , YLLE(ĉn)} of embedding co-ordinates. For all objects ĉ ∈ Ĉ, LLE maps the feature vector F(ĉ) to the embedding vector YLLE(ĉ). Let {ĉη
Having determined the weighting matrix WLLE, the next step is to find a low-dimensional representation of the points in ℱ that preserves this weighting. Thus, for each F(ĉî) approximated as the weighted combination of its k nearest neighbors (kNN), its projection YLLE(ĉî) will be the weighted combination of the projections of these same kNN. The optimal YLLE in the least squares sense minimizes
where tr is the trace operator, γLLE=[YLLE(ĉ1), YLLE(ĉ2), . . . , YLLE(ĉn)], L=(I−WLLE)(I−WLLET) and I is the identity matrix. The minimization of (3.1.4) subject to the constraint γLLEγLLET=I (a normalization constraint that prevents the solution γLLE=0) is an Eigenvalue problem whose solutions are the Eigenvectors of the Laplacian matrix L. Since the rank of L is n−1, the first Eigenvector is ignored and the Eigenvector with the second smallest Eigenvalue represents the best one-dimensional projection of all the samples. The best two-dimensional projection is given by the Eigenvectors with the second and third smallest Eigenvalues, and so forth.
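As a hedged illustration of the LLE procedure, the sketch below follows the standard formulation: reconstruction weights are solved per neighborhood, constrained to sum to one, and the embedding is read from the bottom Eigenvectors. It is not the code used in the Example. NumPy is assumed; the function name `lle`, the regularization term `reg`, and the default parameter values are hypothetical; and the matrix is formed as (I−W)ᵀ(I−W) with one row of W per object, which is an assumed row/column convention.

```python
import numpy as np

def lle(features, n_neighbors=5, beta=2, reg=1e-3):
    """Sketch of locally linear embedding for the rows of `features`."""
    F = np.asarray(features, dtype=float)
    n = F.shape[0]
    dist = np.linalg.norm(F[:, None, :] - F[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(dist[i])[:n_neighbors]
        Z = F[nbrs] - F[i]               # neighbours centred on point i
        G = Z @ Z.T                      # local Gram matrix
        # Assumed regularisation to keep the local system well conditioned.
        G = G + reg * (np.trace(G) + 1e-12) * np.eye(len(nbrs))
        w = np.linalg.solve(G, np.ones(len(nbrs)))
        W[i, nbrs] = w / w.sum()         # weights of each row sum to 1
    I = np.eye(n)
    M = (I - W).T @ (I - W)
    eigvals, eigvecs = np.linalg.eigh(M)  # ascending eigenvalues
    # Ignore the first eigenvector; keep the next `beta` as coordinates.
    return W, eigvecs[:, 1:beta + 1]
```

For points lying along a straight line, the single retained coordinate varies almost linearly with position along the line, which is the behaviour expected of a one-dimensional projection.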
MRI Feature Extraction
Texture Representation of CaP on MRI
Over 350 3D texture feature scenes corresponding to three different texture classes were extracted from each MRI scene. These feature representations were chosen since they have been demonstrated to be able to discriminate between the cancer and non-cancer classes. The feature scenes ℱu=(C, ƒu) are calculated for each C by applying the feature operators Φu, u ∈ {1, . . . , 373}, within a local neighborhood associated with every c ∈ C. Hence ƒu(c) is the feature value associated with feature operator Φu at voxel c. Therefore, it is possible to define a feature vector associated with each c ∈ C as F(c)=[ƒu(c) | u ∈ {1, . . . , 373}]. A κ-neighborhood centered on ci ∈ C is defined as Nκ(ci) where ∀cj ∈ Nκ(ci), ∥cj−ci∥≤κ, i, j ∈ {1, . . . , |C|}, ci∉Nκ(ci). Similarly, a κ-neighborhood Nκ(ĉî) is defined for ĉî ∈ Ĉ where for all cj ∈ Nκ(ĉî), ∥cj−ĉî∥≤κ, î ∈ {1, . . . , |Ĉ|}, j ∈ {1, . . . , |C|}, cj≠ĉî. It is possible to define a feature attribute for each metavoxel ĉ ∈ Ĉ as the median f̂u(ĉî)=MEDIANc
Gradient Features
Gradient features are calculated using steerable and non-steerable linear gradient operators. Eleven non-steerable gradient features were obtained using Sobel, Kirsch and standard derivative operations. Gabor gradient operators comprising the steerable class of gradient calculations were defined for every cεC where c=(x, y, z),
where ω is the frequency of a sinusoidal plane wave along the X-axis, and σX, σY, and σZ are the space constants of the Gaussian envelope along the X, Y, and Z directions respectively. The orientation of the filter, θ, is affected by the coordinate transformations: x1=r(x cos θ+y sin θ), y1=r(−x sin θ+y cos θ) and z1=r(z), where r is the scaling factor. These were computed within the sliding window neighborhood Nκ. Gabor gradient features were calculated at 13 scales, 6 orientations, and 3 window sizes (κ ∈ {3, 5, 7}).
First Order Statistical Features
Four first order statistical features for 3 different window sizes were calculated. They included the mean, median, standard deviation, and range for the gray values of pixels within the sliding window neighborhood Nκ, κε{3, 5, 7}.
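A minimal sketch of how these four first order statistics might be computed for one pixel of a 2D section is given below. It is illustrative only: the function name `first_order_features` is hypothetical, κ is interpreted as the side length of a square window clipped at the image border, and the population standard deviation is used; none of these choices is specified in the text.

```python
from statistics import mean, median, pstdev

def first_order_features(image, row, col, kappa):
    """Mean, median, standard deviation and range of the grey values in
    a kappa x kappa window centred on (row, col), clipped at the border."""
    half = kappa // 2
    values = [
        image[r][c]
        for r in range(max(0, row - half), min(len(image), row + half + 1))
        for c in range(max(0, col - half), min(len(image[0]), col + half + 1))
    ]
    return {
        "mean": mean(values),
        "median": median(values),
        "std": pstdev(values),            # population standard deviation
        "range": max(values) - min(values),
    }
```

Calling the function once per pixel for each κ ∈ {3, 5, 7} would yield the twelve first order feature scenes (four statistics at three window sizes).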
Second Order Statistical Features
To calculate the second order statistical (Haralick) feature scenes, an M×M co-occurrence matrix Pd,c,κ associated with Nκ(c) is computed, where M is the maximum gray scale intensity in C. The value at any location [m1, m2] in Pd,c,κ, where m1, m2 ∈ {1, . . . , M}, represents the frequency with which two distinct voxels ci, cj ∈ Nκ(c), i, j ∈ {1, . . . , |C|}, with associated image intensities ƒ(ci)=m1, ƒ(cj)=m2 are separated by distance d. A total of 13 Haralick features including energy, entropy, inertia, contrast, correlation, sum average, sum variance, sum entropy, difference average, difference variance, difference entropy, local homogeneity and average deviation were extracted at every voxel c ∈ C, based on Pd,c,κ, for κ ∈ {3, 5, 7}, d=1 and M ∈ {64, 128, 256}. FIG. 14(d) shows a feature image (energy) extracted from the co-occurrence matrix (κ=3, G=61, d=1).
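The co-occurrence computation can be illustrated with a short sketch for a single 2D window. This is an assumption-laden example rather than the implementation used in the Example: the function names are hypothetical, grey levels are taken as integers 0 to `levels`−1, four directions are counted symmetrically, and only two of the 13 Haralick features (energy and entropy) are shown.

```python
import math

def cooccurrence(window, levels, d=1):
    """Symmetric, normalised grey-level co-occurrence matrix of a 2D window."""
    P = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(window), len(window[0])
    # Assumed directions: horizontal, vertical and the two diagonals.
    offsets = [(0, d), (d, 0), (d, d), (d, -d)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            for dr, dc in offsets:
                r2, c2 = r + dr, c + dc
                if 0 <= r2 < rows and 0 <= c2 < cols:
                    a, b = window[r][c], window[r2][c2]
                    P[a][b] += 1      # count the pair in both orders
                    P[b][a] += 1
                    total += 2
    if total:
        P = [[v / total for v in row] for row in P]
    return P

def energy(P):
    """Haralick energy: sum of squared co-occurrence probabilities."""
    return sum(v * v for row in P for v in row)

def entropy(P):
    """Haralick entropy in bits, skipping empty cells."""
    return -sum(v * math.log2(v) for row in P for v in row if v > 0)
```

A perfectly uniform window concentrates all probability in one cell, giving maximal energy and zero entropy, whereas textured windows spread the probability mass and lower the energy.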
Consensus Embedding for Feature Extraction
We require a lower dimensional embedding that models the true nature of the underlying manifold that is described in high dimensional space based on relationships between the objects in this space. Varying the feature subspaces of the high dimensional manifold and the parameters (e.g. the number of k nearest neighbors in LLE) associated with NLDR methods achieves multiple embeddings which individually model relationships between these objects. Disclosed herein is a novel method to obtain a stable low dimensional data representation which integrates estimates of object adjacency from multiple lower dimensional representations of the data. Multiple embeddings Xφ,α(ĉ) are generated for ĉ ∈ Ĉ, based on feature subspaces Fα(ĉ) ⊆ F̂(ĉ), α ∈ {1, . . . , B}, using the NLDR schemes φ ∈ {GE, LLE} described hereinabove. Each embedding Xφ,α will hence represent adjacencies between metavoxels ĉî, ĉĵ ∈ Ĉ based on the feature subspace Fα. Thus ∥Xφ,α(ĉî)−Xφ,α(ĉĵ)∥2 will vary as a function of Fα. To represent the true adjacency and class relationship between ĉî, ĉĵ ∈ Ĉ there is a need to combine the multiple embeddings Xφ,α. A confusion matrix Wφ,α ∈ ℝ^(|Ĉ|×|Ĉ|), representing the adjacency between any two metavoxels at locations ĉî, ĉĵ ∈ Ĉ in the lower dimensional embedding representation Xφ,α, is first calculated as
Wφ,α(î,ĵ)=∥Xφ,α(ĉî)−Xφ,α(ĉĵ)∥2, (12.2.2)
where ĉî, ĉĵ ∈ Ĉ, î, ĵ ∈ {1, . . . , |Ĉ|}, φ ∈ {GE, LLE}, α ∈ {1, . . . , B}. The confusion matrix Wφ,α will hence represent the relationships between the metavoxels in each of the B embedding spaces Xφ,α, obtained via Fα, α ∈ {1, . . . , B}. These metavoxel adjacencies can be averaged as:
W̃φ(î,ĵ)=(1/B) Σα Wφ,α(î,ĵ), α ∈ {1, . . . , B},
where W̃φ(î, ĵ) represents the average distance in the reduced dimensional space over B feature sets Fα between the metavoxels at locations ĉî, ĉĵ ∈ Ĉ. Since it is not guaranteed that every Fα will reflect the true class relationship between ĉî, ĉĵ ∈ Ĉ, the assumption is that the average object adjacency W̃φ is a more faithful representation of the true embedding distance between ĉî, ĉĵ. Multidimensional scaling (MDS) is applied to W̃φ to achieve the final combined embedding X̃φ. MDS is implemented as a linear method that preserves the Euclidean geometry between each pair of metavoxels at ĉî, ĉĵ ∈ Ĉ. This is done by finding optimal positions for the data points ĉî, ĉĵ in lower-dimensional space through minimization of the least squares error in the input pairwise Euclidean distances W̃φ. The result is a lower dimensional embedding vector for the MRI feature space X̃φ(ĉ), ∀ĉ ∈ Ĉ, and φ ∈ {GE, LLE}.
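The consensus embedding step can be sketched as follows, assuming the B individual embeddings have already been computed. The sketch averages their pairwise distance matrices and applies classical (Torgerson) MDS, which is one linear MDS variant and is chosen here as an assumption; the function names and the use of NumPy are likewise illustrative and not taken from the source.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance between every pair of rows of X."""
    X = np.asarray(X, dtype=float)
    return np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

def classical_mds(D, dims):
    """Classical MDS: coordinates whose distances approximate D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centring matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centred Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:dims]   # largest eigenvalues first
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

def consensus_embedding(embeddings, dims=3):
    """Average the distance matrices of several embeddings, then apply MDS."""
    W_avg = np.mean([pairwise_distances(X) for X in embeddings], axis=0)
    return W_avg, classical_mds(W_avg, dims)
```

Because only pairwise distances are averaged, the individual embeddings may differ by rotation or reflection without affecting the consensus result.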
Data Integration and Classification
Multimodal Data Integration of MRI and MRS
Owing to the physical differences between the MRS and MRI features, the MRS, MRI meta-classifier is created in the joint MRI and MRS embedding space, where the physicality of the object features has been removed. Thus, while it was not possible to physically concatenate the MRS and MRI features in the original high dimensional space, a direct concatenation of the MRI and MRS embedding coordinates can be done, since objects ĉî and ĉĵ that are adjacent in X̃ and Ỹ should also be adjacent in the combined embedding space. The integration of the embedding spaces is predicated on the fact that they are identically scaled. A combined embedding vector Ẽφ(ĉ)=[X̃φ(ĉ), Ỹφ(ĉ)] is obtained at each ĉ ∈ Ĉ by direct concatenation of X̃φ(ĉ) and Ỹφ(ĉ). For every ĉ ∈ Ĉ with X̃φ(ĉ) ∈ ℝ^(1×β) and Ỹφ(ĉ) ∈ ℝ^(1×β), an Ẽφ(ĉ) ∈ ℝ^(1×2β) exists, where β is the number of Eigenvectors calculated in projecting the data via NLDR.
Consensus k-Means Clustering for Final Classification
We now have 3 vectors associated with each metavoxel ĉ ∈ Ĉ, specifically (i) the lower dimensional embedding obtained from MRI data [X̃φ(ĉ)], (ii) the lower dimensional embedding obtained from MRS data [Ỹφ(ĉ)] and (iii) the integrated lower dimensional embedding [Ẽφ(ĉ)]. To overcome the instability associated with centroid based clustering algorithms, multiple weak clusterings Zψ,φ,t1, Zψ,φ,t2, Zψ,φ,t3, t ∈ {0, . . . , T}, are generated by repeated application of k-means clustering on each of the low dimensional manifolds ψ ∈ {X̃, Ỹ, Ẽ}, each calculated using NLDR methods φ ∈ {GE, LLE}, for all ĉ ∈ Ĉ. Each cluster Z̃ψ,φ is a set of objects which has been assigned the same class label by the k-means clustering algorithm. As the number of elements in each cluster tends to change for each such iteration of k-means, a co-association matrix Hψ,φ is calculated, with the underlying assumption that metavoxels belonging to a natural cluster are very likely to be co-located in the same cluster for each iteration. Co-occurrences of pairs of metavoxels ĉî, ĉĵ in the same cluster Zψ,φ,t are hence taken as votes for their association. Hψ,φ(î, ĵ) thus represents the number of times ĉî, ĉĵ ∈ Ĉ were found in the same cluster over T iterations. If Hψ,φ(î, ĵ)=T then there is a high probability that ĉî, ĉĵ do indeed belong to the same cluster. MDS is applied to Hψ,φ followed by a final unsupervised classification using k-means, to obtain the final stable clusters Z̃ψ,φ1, Z̃ψ,φ2, Z̃ψ,φ3 for each of ψ ∈ {X̃, Ỹ, Ẽ} and φ ∈ {GE, LLE}.
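The co-association step of consensus clustering can be illustrated with the sketch below. It is a simplified stand-in rather than the implementation used in the Example: NumPy is assumed, the plain k-means routine, the function names, and the parameters `T` and `seed` are hypothetical, and the subsequent MDS and final k-means stages are omitted.

```python
import numpy as np

def kmeans(X, k, rng, n_iter=50):
    """Plain k-means with random initial centres drawn from the data."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):          # leave empty clusters untouched
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def co_association(X, k=3, T=25, seed=0):
    """Count how often each pair of objects shares a cluster over T runs."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    H = np.zeros((len(X), len(X)))
    for _ in range(T):
        labels = kmeans(X, k, rng)
        H += (labels[:, None] == labels[None, :])   # one vote per co-occurrence
    return H
```

Entries of the returned matrix close to T indicate pairs that are consistently grouped together across the weak clusterings, while entries near zero indicate pairs that are rarely grouped together.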
Results
Qualitative Results
The scheme disclosed herein was applied on 16 1.5 T datasets at metavoxel resolution (Ĉ). The results from consensus clustering of the lower dimensional embedding from the individual modalities (X̃φ(ĉ) and Ỹφ(ĉ)) were compared with the embeddings created via the combination scheme (Ẽφ(ĉ)).
Quantitative Results
We have already determined the counts of metavoxels lying in the potential cancer space for each dataset at the metavoxel resolution. The counts of the metavoxels in each of the clusters Z̃ψ,φ1, Z̃ψ,φ2, Z̃ψ,φ3, ψ ∈ {X̃, Ỹ, Ẽ}, φ ∈ {GE, LLE}, are compared with this ground truth value. Sensitivity, specificity and positive predictive values (PPV) for each of Z̃ψ,φ1, Z̃ψ,φ2, Z̃ψ,φ3 are obtained. The cluster with the highest sensitivity, specificity and PPV is identified as the cancer class. These results are then averaged over 16 datasets and are summarized in Table 1. The results appear to suggest that integration of MRI and MRS performs marginally better than, or comparably to, MRS or MRI alone. A possible reason that more obvious differences were not observed is the lack of precise ground truth data for CaP extent. Note that LLE shows a higher sensitivity in each case as compared to using graph embedding, but that GE shows a higher specificity and PPV.
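For reference, the three evaluation measures can be computed as in the following sketch. It assumes that a binary label is available for every metavoxel, both for the candidate cluster and for the potential cancer space, which simplifies the count-based comparison described above; the function name and the example labels in the usage note are hypothetical and are not data from the study.

```python
def detection_metrics(predicted, truth):
    """Sensitivity, specificity and PPV from paired binary labels."""
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)
    fp = sum(1 for p, t in zip(predicted, truth) if p and not t)
    tn = sum(1 for p, t in zip(predicted, truth) if not p and not t)
    fn = sum(1 for p, t in zip(predicted, truth) if not p and t)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }
```

For example, `detection_metrics([1, 1, 0, 0, 1], [1, 0, 0, 0, 1])` counts two true positives, one false positive, two true negatives and no false negatives.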
Most current efforts in computer-aided diagnosis of CaP from DCE-MRI involve pharmacokinetic curve fitting such as in the 3 Time Point (3TP) scheme. Based on the curve/model fits these schemes attempt to identify wash-in and wash-out points, i.e. time points at which the lesion begins to take up and flush out the contrast agent. Lesions are then identified as benign, malignant or indeterminate based on the rate of the contrast agent uptake and wash out. A supervised CAD scheme was disclosed for analysis of the peripheral zone of the prostate. Pharmacokinetic features derived from curve fitting were used to train the model and coarse quantitative evaluation was performed based on a roughly registered spatial map of CaP on MRI. Area under the Receiver Operating Characteristic (ROC) curve (AUC) was used as a measure of accuracy. A mean AUC of 0.83 was reported. Due to the lack of perfect slice correspondences between MRI and histology data and the large difference in the number of slices between the two modalities, training a supervised classification system based on such labels is inappropriate.
The 3TP and pharmacokinetic modeling approaches assume linear changes in the dynamic MR image intensity profiles. Such data suffers from intensity non-standardness, wherein MR image intensities do not have fixed tissue-specific meaning within the same imaging protocol, body region, and patient.
This example presents a comprehensive segmentation, registration and detection scheme for CaP from 3 T in vivo DCE-MR imagery that has the following main features: (1) a multi-attribute active shape model is used to automatically segment the prostate boundary, (2) a multimodal non-rigid registration scheme is used to map CaP extent from whole mount histological sections onto corresponding DCE-MR imagery, and (3) an unsupervised CaP detection scheme involving LLE on the temporal intensity profiles at every pixel location is followed by classification via consensus clustering. The methodology disclosed is evaluated on a per-pixel basis against registered spatial maps of CaP on MRI. Additionally, the results obtained using the disclosed novel methodology are compared with those obtained from the 3TP method for a total of 21 histology-MRI slice pairs.
Experimental Design
Data Description and Notation
A total of 21 3 T in vivo endorectal MR (T2-weighted and DCE protocols) images with corresponding whole mount histological sections (WMHS) following radical prostatectomy were obtained from 6 patient datasets from the Beth Israel Deaconess Medical Center. The DCE-MR images were acquired during and after a bolus injection of 0.1 mmol/kg of body weight of gadopentetate dimeglumine using a 3-dimensional gradient echo sequence (3D-GE) with a temporal resolution of 1 min 35 sec. Following radical prostatectomy, whole-mount sections of the prostate were stained via Haematoxylin and Eosin (H & E) and examined by a trained pathologist to accurately delineate the presence and extent of CaP.
We define a 2D DCE-MR image CD,t=(C, ƒD,t) where C is a set of spatial locations ci ∈ C, i ∈ {1, . . . , |C|}, |C| is the cardinality of C and t ∈ {1, . . . , 7}. ƒD,t(c) then represents the intensity value at location c ∈ C at timepoint t. A 2D T2-weighted (T2-w) MR image is defined as CT
Automated Boundary Segmentation on in vivo MR Imagery
A Multi-Attribute, Non-initializing, Texture Reconstruction based Active shape model (MANTRA) algorithm is described hereinabove. Unlike traditional ASMs, MANTRA makes use of local texture model reconstruction to overcome limitations of image intensity, as well as multiple attributes with a combined mutual information metric. MANTRA also requires only a rough initialization (such as a bounding-box) around the prostate to be able to segment the boundary accurately.
Step 1 (Training): PCA is performed on expert selected landmarks along the prostate border to generate a statistical shape model. A statistical texture model is calculated for each landmark point by performing PCA across patches of pixels sampled from areas surrounding each landmark point in each training image.
Step 2 (Segmentation): Regions within a new image are searched for the prostate border and potential locations have patches of pixels sampled from around them. The pixel intensity values within a patch are reconstructed from the texture model as best possible, and mutual information is maximized between the reconstruction and the original patch to test for a border location. An active shape model (ASM) is fit to such locations, and the process repeats until convergence.
Establishment of CaP Ground Truth on DCE-MRI Via Elastic Multimodal Registration of Histology, T2-w, and DCE-MRI
This task comprises the following steps:
3. MI-based affine registration of CT
For each pixel c within each DCE-MR image CD,t, t ∈ {1, . . . , 7}, there is an associated intensity feature vector F(ci)=[ƒD,t(ci) | t ∈ {1, . . . , 7}], ci ∈ C, i ∈ {1, . . . , |C|}. LLE is used to embed the set ℱ={F(c1), F(c2), . . . , F(cp)}, p=|C|, to result in the set of lower dimensional embedding vectors χ={XLLE(c1), XLLE(c2), . . . , XLLE(cp)}. Let {cη
subject to the constraints WLLE(i, j)=0 if cj does not belong to the mNN of ci and ΣjWLLE(i, j)=1, ci, cj ∈ C. The low-dimensional projection of the points in ℱ that preserves the weighting in WLLE is determined by approximating each projection XLLE(ci) as a weighted combination of its own mNN. The optimal χLLE in the least squares sense minimizes
where tr is the trace operator, χLLE=[XLLE(c1), XLLE(c2) . . . XLLE(cp)], L=(I−WLLE)(I−WLLET) and I is the identity matrix. The minimization of (2.4.2) subject to the constraint χLLEχLLET=I (a normalization constraint that prevents the solution χLLE=0) is an Eigenvalue problem whose solutions are the Eigenvectors of the Laplacian matrix L.
Unsupervised Classification Via Consensus k-Means Clustering:
To overcome the instability associated with centroid based clustering algorithms, N weak clusterings Ṽn1, Ṽn2, . . . , Ṽnk, n ∈ {0, . . . , N}, are generated by repeated application of k-means clustering for different values of k ∈ {3, . . . , 7} on the low dimensional manifold XLLE(c), for all c ∈ C, and are combined via consensus clustering. As the number of classes (clusters) to look for in the data is not known a priori, k is varied to determine up to 7 possible classes in the data. A co-association matrix H is calculated with the underlying assumption that pixels belonging to a natural cluster are very likely to be co-located in the same cluster for each iteration. H(i, j) thus represents the number of times ci, cj ∈ C, i≠j, were found in the same cluster Ṽnk over N iterations. If H(i, j)=N then there is a high likelihood that ci, cj do indeed belong to the same cluster. Multidimensional scaling (MDS) is applied to H, which finds optimal positions for the data points ci, cj in lower-dimensional space through minimization of the least squares error in the input pairwise similarities in H. A final unsupervised classification via k-means is used to obtain the stable clusters Vk1, Vk2, . . . , Vkq, q=k, for all k ∈ {3, . . . , 7}.
Results
Qualitative Results
Representative results from experiments on 21 DCE-histology slice pairs are shown in
of the contrast agent uptake. When w is close to 1, the corresponding pixel is identified as cancerous (red); when w is close to zero, the pixel is identified as benign (blue); green pixels are those identified as indeterminate.
Quantitative Evaluation Against Registered CaP Ground Truth Estimates on DCE
For each of 21 slices, labels corresponding to the clusters Vk1, Vk2, . . . , Vkq, q=k, for each k ∈ {3, 4, 5, 6, 7} are each evaluated against the registered CaP extent on DCE-MRI (GR(CD,5)). The cluster label showing the largest overlap with this ground truth is then chosen as the cancer class. This class is used to calculate the sensitivity, specificity, and accuracy of the CAD system at a particular k value for the slice under consideration. These values are then averaged across all 21 slices and are summarized in Table 1. The maximum sensitivity observed is 60.64% (k=3), the maximum specificity is 84.54% (k=7), and the maximum accuracy is 77.20% (k=7). A reduction in sensitivity is observed as k increases from 3 to 7, with a corresponding increase in specificity and accuracy. Using the 3TP technique (which assumes that only 3 classes can exist in the data), a sensitivity of 41.53% and a specificity of 70.04% are obtained. It is evident that the technique disclosed has an improved performance as compared to the popular state-of-the-art 3TP method across k ∈ {3, 4, 5, 6, 7}.
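The selection of the cancer class for one slice can be sketched as below. The example assumes that overlap is measured as the raw number of ground truth pixels falling in each cluster, which is one possible reading of "largest overlap"; the function name is hypothetical and the labels in the usage note are invented for illustration.

```python
def select_cancer_cluster(labels, truth):
    """Pick the cluster label overlapping most with the ground truth and
    report the per-pixel accuracy obtained with that choice."""
    overlap = {}
    for lab, is_cancer in zip(labels, truth):
        overlap[lab] = overlap.get(lab, 0) + int(bool(is_cancer))
    best = max(overlap, key=overlap.get)
    predicted = [lab == best for lab in labels]
    correct = sum(p == bool(t) for p, t in zip(predicted, truth))
    return best, correct / len(labels)
```

With cluster labels `[0, 0, 1, 1, 2, 2]` and ground truth `[0, 0, 1, 1, 0, 1]`, cluster 1 contains two ground truth pixels and is selected, and five of the six pixels are then labeled correctly.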
Comparison Against Existing Prostate DCE CAD
Analysis of the results of Vos et al. [Vos, P., Hambrock, T., et al.: Computerized analysis of prostate lesions in the peripheral zone using dynamic contrast enhanced MRI. Medical Physics 35(3) (2008) 888-899] in differentiating between non-malignant suspicious enhancing and malignant lesions in the prostate reveals that their sensitivity of 83% corresponds to a 58% specificity. These values were obtained by considering only the peripheral zone of the prostate. By comparison, the metrics reported here (60.64% sensitivity, 84.54% specificity, and 77.20% accuracy) were achieved when examining the whole of the prostate while utilizing a more rigorously registered CaP extent for evaluation.
Recently, Magnetic Resonance Imaging (MRI) has emerged as a promising modality for prostate cancer (CaP) detection. Initial results using high resolution 3 Tesla (T) endorectal in vivo prostate MRI suggest significant improvements in both the contrast and resolution of internal structures over 1.5 T MRI as well as ultrasound imaging. Additionally, a focus of recent work in prostate MRI has been on the integration of multi-protocol MR data [2] to make use of different types of information to detect CaP. Other related work in this field has focused on integration approaches for these data, but has not definitively indicated that such information fusion improves diagnostic accuracy. These attempts have made use of just the T2-w intensity value or simple T2-w based intensity features. Textural representations of T2-w data can be used to better characterize CaP regions, as they can better capture the inherent structural information. Integrating functional information with such textural representations of structure is therefore expected to significantly improve the classification accuracy.
Experimental Design:
Notation and Pre-Processing
A 2D prostate T2-w MR image is represented as ζT2=(C,ƒT2) where C is a finite 2D rectangular array of pixels c ∈ C and ƒT2 is a function that assigns an intensity value to every c ∈ C. The 2D prostate DCE image is represented as ζD,t=(C,ƒD,t) where ƒD,t is a function that assigns an intensity value to every c ∈ C at time point t, t ∈ {1, . . . , 7}. G(ζT2) and G(ζD) represent the sets of locations on ζT2 and ζD,t that form the spatial extent of CaP obtained by registration of corresponding whole mount histological sections and MR images via the previously presented Combined Feature Ensemble Mutual Information (COFEMI) scheme in conjunction with thin-plate spline warping. The histological CaP extent is first mapped onto ζT2 to obtain G(ζT2). Affine alignment of the higher resolution ζT2 to the lower resolution ζD,5 (chosen due to improved contrast) then allows this CaP extent to be mapped onto ζD,5 to obtain G(ζD). The ITK toolkit's BiasCorrector algorithm was used to correct each of the original 2D MR images, ζT2 and ζD,t, for bias field inhomogeneity. Intensity standardization was then used to correct for the non-linearity in MR image intensities on ζT2 alone, to ensure that the T2-w intensities have the same tissue-specific meaning across images within the same study, as well as across different patient studies. All data was analyzed at the DCE-MRI resolution, with appropriate alignment being done as described.
Six texture features (steerable features (Sobel-Kirsch operators), first order statistical features (standard deviation operator), and second order statistical (Haralick) features including intensity average, entropy, correlation, and contrast inverse moment) that have previously demonstrated good separation between cancer and non-cancer classes were extracted from ζT2. The features described in Chan et al., 2003 [Chan, I., W. Wells III, et al. (2003). “Detection of prostate cancer by integration of line-scan diffusion, T2-mapping and T2-weighted magnetic resonance imaging; a multichannel statistical classifier.” Medical Physics 30(9): 2390-2398] were also extracted in order to compare the scheme disclosed herein with a related multi-protocol approach. Multiple possible combinations of the information contained in the structural and functional images are considered. Five feature sets of data Fψ, ψ ∈ {D, T2, T2f, ints, feats}, are created for each pixel in each of the images:
DCE intensity vector FD(c)=[ƒD,t(c)|tε{1, . . . , 7}]
T2-w intensity alone FT2(c)=[ƒT2(c)]
The classifier used is based on the voting of results from an ensemble of bagged (bootstrapped aggregated) decision trees. For a given training set Πψ, ψε{D, T2ƒ, ints, feats}, N bootstrapped subsets Siψ, iε{1, . . . , N}, N=50, are created with replacement of the training data. Based on each training subset Siψ a C4.5 decision tree classifier ξiψ is constructed while considering each pixel c as an individual sample. The classification result ∂iψ(c) for pixel c is obtained using decision tree ξiψ; ∂iψ(c)=1 if c is classified as cancerous and ∂iψ(c)=0 otherwise. The final classification is also done on a pixel basis and is the majority vote of the classification results from each individual decision tree ∂iψ. For each pixel c, it is possible to assign the frequency of being labeled cancerous as
This frequency is analogous to the probability of a particular pixel being cancer, thus a high value of P(c) indicates a high probability of a pixel c being cancerous. This yields a “frequency map” of cancer presence where brighter regions on the image will indicate a higher probability of cancer.
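The bagging and voting procedure can be sketched as follows. To keep the example short and dependency-free, one-level decision trees (stumps) stand in for the C4.5 trees named in the text; this is a deliberate simplification and not the classifier described. The function names, the toy one-dimensional feature vectors, and the seed are hypothetical.

```python
import random


def train_stump(samples, labels):
    """Fit a one-level tree: the (feature, threshold, polarity) with fewest errors."""
    best = None
    for j in range(len(samples[0])):
        for thr in sorted({x[j] for x in samples}):
            for polarity in (1, 0):  # label predicted when x[j] > thr
                errors = sum(
                    (polarity if x[j] > thr else 1 - polarity) != y
                    for x, y in zip(samples, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, thr, polarity)
    _, j, thr, polarity = best
    return lambda x: polarity if x[j] > thr else 1 - polarity


def train_bagged_ensemble(samples, labels, n_trees=50, seed=0):
    """Train n_trees stumps, each on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(samples)
    ensemble = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(
            train_stump([samples[i] for i in idx], [labels[i] for i in idx])
        )
    return ensemble


def cancer_frequency(ensemble, x):
    """P(c): fraction of trees that label the pixel's feature vector x as cancer."""
    return sum(tree(x) for tree in ensemble) / len(ensemble)


def classify(ensemble, x):
    """Majority vote over the ensemble (1 = cancer, 0 = non-cancer)."""
    return 1 if cancer_frequency(ensemble, x) > 0.5 else 0


# Toy training pixels (hypothetical): one feature, low values are non-cancer.
samples = [[0.1], [0.2], [0.3], [0.8], [0.9], [1.0]]
labels = [0, 0, 0, 1, 1, 1]
ensemble = train_bagged_ensemble(samples, labels, n_trees=50, seed=0)
p_low = cancer_frequency(ensemble, [0.05])
p_high = cancer_frequency(ensemble, [1.5])
```

Evaluating `cancer_frequency` at every pixel of an image would give the frequency map described above, and thresholding it at 0.5 reproduces the majority vote.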
Results:
Sample classification results are shown in
m) shows the best Receiver-Operating Characteristic (ROC) curves obtained from among the 10 slices with different colours corresponding to the different feature sets, obtained via leave-one-out cross validation using permutations of the 10 images. The highest AUC (area under the ROC curve) value is for Ffeats (shown in red), while the lowest is for FT2 (shown in purple). AUC values averaged over 10 slices for each of the different feature sets are summarized in Table 1 with corresponding standard deviations.
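For reference, the area under the ROC curve can be computed directly from per-pixel scores and ground-truth labels. The sketch below uses the rank-based formulation, in which the AUC equals the probability that a randomly chosen cancerous pixel receives a higher score than a randomly chosen non-cancerous pixel, with ties counted as one half. The function name and the toy scores are hypothetical, and this is not necessarily how the reported values were obtained.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # A correctly ordered pair counts 1, a tie counts 1/2, otherwise 0.
    wins = sum(
        (1.0 if p > n else (0.5 if p == n else 0.0))
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy frequency-map values and ground-truth labels for four pixels.
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_labels = [1, 0, 1, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
```

The pairwise loop is quadratic in the number of pixels, which is acceptable for a small illustration; a sort-based version would be preferable for full images.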
Paired Student's t-tests were conducted using the AUC, specificity and sensitivity values at the operating point of the respective ROC curves for each image to determine the significance of the results obtained with different feature combinations (Tables 2, 3 and 4). It was found that Ffeats significantly (p<0.05) outperformed classification based on any of {FT2, FT2ƒ, Fints} in terms of AUC. Additionally, it is evident that classification based on simple thresholding of FT2 is extremely poor, showing the worst performance in terms of sensitivity, specificity and AUC. The superior performance when using Ffeats suggests that integrating structural textural features and functional information performs better than any individual modality.
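The paired t-test used for these comparisons reduces to a short calculation on the per-image differences. The sketch below computes only the t statistic and its degrees of freedom; the p-value would then be read from a t-distribution with n-1 degrees of freedom, which is omitted here. The function name and the AUC values are hypothetical and are not taken from the tables.

```python
import math


def paired_t_statistic(a, b):
    """Return (t, degrees of freedom) for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Unbiased sample variance of the paired differences.
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n), n - 1


# Hypothetical per-image AUC values for two feature sets on four images.
auc_set_a = [0.80, 0.75, 0.90, 0.85]
auc_set_b = [0.70, 0.70, 0.80, 0.70]
t_stat, dof = paired_t_statistic(auc_set_a, auc_set_b)
```

A paired test is appropriate because both feature sets are evaluated on the same images, so the per-image differences remove the variation between images.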
Comparison to Previous Work:
A simple comparison of relative runtimes and accuracies between a related method for CaP detection on multi-protocol MRI (described in [2]) and the proposed technique is shown in Table 5. All experiments were conducted using MATLAB 7.6 (MathWorks Inc.) on a machine with 32 GB RAM and two dual-core 2.33 GHz 64-bit Intel Core 2 processors. It is clear that the proposed method offers significant improvements, even with differences in datasets and protocols being taken into account. It must be noted that the technique in Chan et al., 2003 was observed to have significant implementation and runtime issues, as well as demonstrating acute sensitivity to training data. By comparison, the proposed method was more robust due to the use of bagging, and does not suffer from the redundancy in feature space that has also been noted in Chan et al., 2003.
Contributions:
Having described preferred embodiments of the invention with reference to the accompanying drawings, it is to be understood that the invention is not limited to the precise embodiments, and that various changes and modifications may be effected therein by those skilled in the art without departing from the scope or spirit of the invention as defined in the appended claims.
This application is a 371 national phase application of PCT International Application No. PCT/US2008/081656, filed on Oct. 29, 2008, which claims priority from U.S. Provisional Application No. 60/983,553, filed Oct. 29, 2007, the contents of which are incorporated herein by reference.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US2008/081656 | 10/29/2008 | WO | 00 | 9/16/2010 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2009/058915 | 5/7/2009 | WO | A |
Number | Date | Country |
---|---|---|
20100329529 A1 | Dec 2010 | US |

Number | Date | Country |
---|---|---|
60983553 | Oct 2007 | US |