1. Field of the Invention
The present invention relates to a method, apparatus, and computer readable medium involving a novel multiscale system that combines segmentation with classification to detect abnormal or anatomical body structures in medical applications in general, more particularly in medical imagery, and still more particularly brain structures in medical imagery.
2. Prior Art
Identifying 3D body structures, such as 3D brain or other tissue or bone structures, in medical applications in general and more specifically in medical imagery, particularly in MRI (Magnetic Resonance Imaging) scans, is important for early detection of tumors, lesions, and abnormalities, with applications in diagnosis, follow-up, and image-guided surgery. Computer-aided analysis can assist in identifying such structures, and in particular brain, other tissue, and bone structures, in extracting quantitative and qualitative properties of structures, and in evaluating their progress over time. In this document a novel method and apparatus for detecting abnormal or anatomical tissue or bone structures, such as brain structures, is presented, focusing on 3D MRI brain data containing scans of multiple sclerosis (MS) patients as a specific example.
Automatic detection of abnormal brain structures, and particularly MS lesions, is difficult. Abnormal structures exhibit extreme variability: their shapes are deformable, their location across patients may differ significantly, and their intensity and texture characteristics may vary. Detection techniques based on template matching [4] or more recent techniques based on constellations of appearance features (e.g., [5]), which are common in computer vision, are not well suited to handle such amorphous structures. Consequently, with few exceptions (e.g., [11]), medical applications commonly approach this problem by applying classification algorithms that rely on a voxel-by-voxel analysis (e.g., [14], [15], [16], [17]). These approaches, however, are limited in their ability to utilize regional properties, particularly properties related to shape, boundaries, and texture. (Numbers in square brackets refer to the referenced literature listed at the end of the specification.)
The present invention introduces a novel multiscale method and apparatus that combines segmentation with classification for detecting abnormal or anatomical tissue or bone structures, and in particular 3D brain structures. The method is based on a combination of a powerful multiscale segmentation algorithm, Segmentation by Weighted Aggregation (SWA) ([12], [7]), a rich feature vocabulary describing the segments, and a decision tree-based classification of the segments. By combining segmentation and classification, integrative and regional properties can be utilized that provide regional statistics of segments, characterize their overall shapes, and localize their boundaries. At the same time, the rich hierarchical decomposition produced by the SWA algorithm largely circumvents inaccuracies due to the segmentation process: even when a lesion is not segmented properly, one can generally expect to find some aggregate in the hierarchy that sufficiently overlaps it to allow classification.
The SWA algorithm is adapted to handle 3D multi-channel MRI scans and anisotropic voxel resolutions. These allow the algorithm to handle realistic MRI scans. The bank of features used characterizes each aggregate in terms of intensity, texture, shape, and location. These features were selected in consultation with expert radiologists. All the features are computed as part of the segmentation process, and they are used in turn to further affect the segmentation process. The classification step examines each aggregate and labels it as either lesion or non-lesion. This classification is integrated across scale to determine the voxel classification of the lesions. The utility of the method is demonstrated through experiments on simulated and real MRI data showing detection of MS lesions.
Other and further objects and advantages of the invention will become apparent from the following detailed description taken in conjunction with the drawings.
The following description is organized as follows. First, the segmentation procedure is presented; next, the feature extraction method and the classification model of the novel system; next, results based on simulated and real MRI data; next, the hardware and software used for the method and apparatus; and finally, a discussion and conclusions.
First, the system for detecting abnormal brain structures is described. In a training phase of the system, several MR scans along with a delineation of the lesions in these scans are obtained and input into the hardware. The system uses segmentation to provide a complete hierarchical decomposition of the 3D data into regions corresponding to both meaningful anatomical structures and lesions. Each aggregate is equipped with a collection of multiscale features. Finally, a classifier is trained to distinguish between aggregates that correspond to lesions from those that correspond to non-lesions.
Once the classifier is trained, the novel system is ready to proceed to apply the novel method to unlabeled test data. At this stage the system obtains as input an MRI scan of a single brain. It then segments the scan and extracts features to describe the aggregates. Finally, each aggregate is classified as either a lesion or a non-lesion, and the voxel occupancy of the lesions is determined.
One of the features used to describe an aggregate is its location in the brain. To utilize this property, each scan is first brought to a common coordinate system. In the implementation this was achieved using the SPM software package [6], which registers a scan to an atlas composed of an average of 152 T1-weighted scans of different subjects.
The Segmentation by Weighted Aggregation (SWA) algorithm ([12], [7]) is used, extended to handle 3D multi-channel and anisotropic data. In this section the SWA algorithm is reviewed along with the novel extensions.
Given a 3D MRI scan, a 6-connected graph G=(V,W) is constructed as follows. Each voxel i is represented by a graph node i, so V={1, 2, . . . , N}, where N is the number of voxels. A weight is associated with each pair of neighboring voxels i and j. The weight wij reflects the contrast between the two neighboring voxels i and j:

wij = e^(−α|Ii − Ij|)   (1)

where Ii and Ij denote the intensities of the two neighboring voxels, and α is a positive constant (α=15 in our experiments).
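By way of a non-limiting illustration, and not as a description of the actual patented implementation, the following Python sketch computes the initial 6-connected edge weights of Eq. (1) for a 3D intensity volume; the array name, shapes, and the value α=15 are taken from the text or assumed for the example.

```python
import numpy as np

def initial_edge_weights(volume, alpha=15.0):
    """Compute w_ij = exp(-alpha * |I_i - I_j|) between each voxel and its
    6-connected neighbors in a 3D intensity volume.

    Returns a dict mapping axis -> array of weights between each voxel and
    its successor along that axis (shape shrinks by one on that axis)."""
    weights = {}
    for axis in range(3):
        # Intensity contrast between neighboring voxels along this axis.
        diff = np.abs(np.diff(volume, axis=axis))
        weights[axis] = np.exp(-alpha * diff)
    return weights

# Tiny usage example on a random volume (stand-in for an MRI scan).
if __name__ == "__main__":
    vol = np.random.rand(8, 8, 4)          # coarser z-resolution, as in MRI
    w = initial_edge_weights(vol, alpha=15.0)
    print([w[a].shape for a in range(3)])  # one edge-weight array per axis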
We define the saliency of a segment by applying a normalized-cut-like measure as follows. Every segment S⊂V is associated with a state vector u=(u1, u2, . . . , uN), representing the assignments of voxels to the segment S (ui=1 if i∈S and ui=0 otherwise). The saliency Γ associated with S is defined by

Γ(S) = (u^T L u) / (u^T W u)   (2)

which sums the weights along the boundaries of S divided by the internal weights. Segments that yield small values of Γ(S) are considered salient. The matrix W includes the weights wij, and L is the Laplacian matrix of G. Our objective is to find those partitions characterized by small values of Γ(S). To find the minimal cuts in the graph, we construct a coarse version of this graph. This coarse version is constructed so that we can use salient segments in the coarse graph to predict salient segments in the fine graph using only local calculations. This coarsening process is repeated recursively, constructing a full pyramid of segments.
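As a small illustration of the saliency measure of Eq. (2), the sketch below evaluates Γ(S) = (uᵀLu)/(uᵀWu) for a binary segment indicator over a toy weighted graph; the graph and the segment are made up for the example and are not taken from the invention.

```python
import numpy as np

def saliency(W, u):
    """Gamma(S) = (u^T L u) / (u^T W u): the weight cut by the segment's
    boundary divided by its internal weight.  W is a symmetric weight
    matrix; u is a 0/1 indicator vector of the segment S."""
    L = np.diag(W.sum(axis=1)) - W          # graph Laplacian of G
    boundary = u @ L @ u                    # sum of weights crossing the boundary of S
    internal = u @ W @ u                    # sum of weights internal to S
    return boundary / internal

# Example: a 4-node chain graph; segment S = {0, 1}.
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 0.1, 0.0],
              [0.0, 0.1, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
u = np.array([1.0, 1.0, 0.0, 0.0])
print(saliency(W, u))   # small value -> salient segment
```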
The coarsening procedure proceeds recursively as follows. Denoting the scale by a superscript, the given graph is G[0]=(V[0],E[0],W[0]), and from it we create a sequence of graphs G[1], . . . , G[k] of decreasing size.
As in the general AMG setting [1], the construction of a coarse graph from a fine one is divided into three stages: first a subset of the fine nodes is chosen to serve as the seeds of the aggregates (the latter being the nodes of the coarse graph). Then, the rules for interpolation are determined, establishing the fraction of each non-seed node belonging to each aggregate. Finally, the weights of the edges between the coarse nodes are calculated.
Coarse seeds: The construction of the set of seeds C, and of its complement denoted by F, is guided by the principle that each F-node should be “strongly coupled” to C. To achieve this objective we start with an empty set C, hence F=V, and sequentially (according to decreasing aggregate size defined herein) transfer nodes from F to C until all the remaining i∈F satisfy

Σ(j∈C) wij ≥ β Σ(j∈V) wij,

where β is a parameter (in the experiments β=0.2).
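By way of example only, a minimal sketch of this seed-selection rule follows, assuming a dense symmetric weight matrix and β=0.2; the visiting order here (by total coupling) is an assumption standing in for the "decreasing aggregate size" ordering described in the text.

```python
import numpy as np

def select_seeds(W, beta=0.2):
    """Greedily move nodes from F to the seed set C until every remaining
    F-node i satisfies  sum_{j in C} w_ij >= beta * sum_j w_ij,
    i.e. every F-node is strongly coupled to C.  W is symmetric."""
    n = W.shape[0]
    total = W.sum(axis=1)
    in_C = np.zeros(n, dtype=bool)
    order = np.argsort(-total)               # assumed stand-in for the size ordering
    seeds = []
    for i in order:
        coupling_to_C = W[i, in_C].sum()
        if coupling_to_C < beta * total[i]:  # i is not yet strongly coupled to C
            in_C[i] = True
            seeds.append(int(i))
    return np.array(seeds)

# Usage on a random symmetric weight matrix (stand-in for the voxel graph).
rng = np.random.default_rng(0)
A = rng.random((10, 10)); W = (A + A.T) / 2; np.fill_diagonal(W, 0)
print(select_seeds(W, beta=0.2))
```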
The coarse problem: We define for each node i∈F a coarse neighborhood Ni={j∈C : wij>0}. Let I(j) be the index in the coarse graph of the node that represents the aggregate around a seed whose index at the fine scale is j. An interpolation matrix P (of size N×n, where n=|C|) is defined by

P_{i,I(j)} = wij / Σ(k∈Ni) wik   for i∈F and j∈Ni,
P_{i,I(i)} = 1                   for i∈C,
P_{i,l}    = 0                   otherwise.   (3)

This matrix satisfies u ≈ Pu[s], where u[s]=(u1[s], u2[s], . . . , un[s]) is the coarse-level state vector. P_{il} represents the likelihood that an aggregate i at a fine level belongs to an aggregate l at a coarser level. Finally, an edge connecting two coarse aggregates p and q is assigned the weight

wpq = Σ(i≠j) P_{ip} wij P_{jq}.
Note that since u[s−1] ≈ Pu[s], the relation Eq. (2) inductively implies that a similar expression approximates Γ at all levels. However, W[s] is modified to account for aggregative properties: we modify wpq[s] between a pair of aggregates p and q at scale s by multiplying it with an exponentially decreasing function of the distance between their aggregative properties.
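To make the coarsening step concrete, the following is a hedged sketch of the interpolation matrix of Eq. (3) and of the coarse edge weights; it is a plausible reading of the cited SWA literature, not a verbatim reproduction of the patented code, and it ignores the aggregative-property modulation of the weights described above.

```python
import numpy as np

def interpolation_matrix(W, seeds):
    """Build P (N_fine x n): seeds interpolate to themselves; every other
    node distributes its membership among the seeds it is coupled to."""
    n_fine = W.shape[0]
    seed_index = {int(s): k for k, s in enumerate(seeds)}   # fine id -> coarse id
    P = np.zeros((n_fine, len(seeds)))
    for i in range(n_fine):
        if i in seed_index:
            P[i, seed_index[i]] = 1.0                        # seed belongs to itself
        else:
            w_to_seeds = np.array([W[i, s] for s in seeds])
            if w_to_seeds.sum() > 0:
                P[i] = w_to_seeds / w_to_seeds.sum()         # soft memberships
    return P

def coarse_weights(W, P):
    """Aggregate fine weights into coarse ones: w_pq = sum_{i!=j} P_ip w_ij P_jq."""
    Wc = P.T @ W @ P
    np.fill_diagonal(Wc, 0.0)                                # keep inter-aggregate edges only
    return Wc

# Demo: 5 fine nodes, seeds {0, 3}.
W = np.array([[0.0, 0.9, 0.1, 0.0, 0.0],
              [0.9, 0.0, 0.5, 0.2, 0.0],
              [0.1, 0.5, 0.0, 0.8, 0.0],
              [0.0, 0.2, 0.8, 0.0, 0.7],
              [0.0, 0.0, 0.0, 0.7, 0.0]])
P = interpolation_matrix(W, seeds=[0, 3])
print(P)
print(coarse_weights(W, P))
```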
Table 1 summarizes the segmentation algorithm.
Common MRI data is anisotropic, with lower vertical (inter-slice) resolution. The SWA algorithm, however, assumes that the voxels at the fine level are equally spaced; ignoring this effect may lead to distorted segmentations. To solve this problem we modify the algorithm as follows. During the first few coarsening steps we consider each 2D slice separately in performing seed selection and inter-scale interpolation (steps 1-2 in Table 1), allowing non-zero interpolation weights only between nodes of the same slice. The rest of the steps (steps 3-5 in Table 1) are performed on the full 3D graph, i.e., taking into account inter-slice connections. This procedure is repeated until the inner- and inter-slice distances are approximately equal. Subsequent coarsening steps consider the full 3D graph.
For example, consider data with 5mm slice thickness versus 1mm×1mm in-slice resolution. Every coarsening step of the SWA algorithm typically reduces the number of nodes by a factor of 2.5-3. Consequently, if we apply the algorithm to a 2D slice, the distance between neighboring nodes in a slice grows at every level by a factor of √2.5-√3 on average, so three coarsening steps are needed to bring the inner- and inter-slice distances to be roughly equal.
A major aspect of MR imaging is the large variety of pulse sequences that can be applied. These sequences produce different images for the same tissue, highlighting different properties of the tissue. We incorporate multi-channel data in the algorithm in a fairly straightforward manner. Given a multi-channel scan, each voxel now includes a vector of intensities. The initialization step Eq. (1) is modified to determine the initial weights utilizing intensity information from all m channels as follows:

wij = e^(−Σ(c=1..m) αc |Ii^c − Ij^c|),

where αc are pre-determined constants (αT2=15, αPD=αT1=10) and Ii^c is the intensity of voxel i in channel c. In addition, we maintain different sets of aggregative features for every channel (see herein below) and use these properties to modify the edge weights at coarser levels.
Lesions can often be characterized by properties of aggregates that emerge at intermediate scales, and are difficult to extract by any uni-scale procedure. Such properties may include, for instance, intensity homogeneity, principal direction of the lesion, and intensity contrast with respect to neighboring tissues. Voxel-by-voxel analysis is limited in the ability to utilize such scale-dependent properties.
We refer to such properties as aggregative features. The weighted-aggregation scheme provides a recursive mechanism for calculating such properties along with the segmentation process. We use these properties for two purposes. First, we use these aggregative properties to affect the construction of the segmentation pyramid. Second, these properties are available for the classification procedure, see below.
For an aggregate k at scale s we express an aggregative property as a number reflecting the weighted average of some property q that emerged at a finer scale r (r≤s). For example, the average intensity of k is an aggregative property, since it is the average over all intensities measured at the voxels (nodes of scale r=0) that belong to k. More complex aggregative properties can be constructed by combining several properties (e.g., variance below) or by taking averages over aggregative properties of finer scales (e.g., average of variances below). We denote such a property by Qk[r,s], and shorten this to Q[r] when the context is clear.
In addition to these properties we can define binary aggregative properties, reflecting relations between two aggregates k and l at scale s. Such properties, denoted by Qkl, are useful for describing boundary relations between neighboring tissues, e.g., surface area of boundary between k and l or the contrast between the average intensity of an aggregate k and the average intensity of its neighbors.
The aggregative properties of an aggregate k are in fact averages over its sub-aggregates properties. Such properties can be accumulated from one level of scale to the next with the interpolation weights determining the relative weight of every sub-aggregate. For a detailed description on the accumulation of such properties see [7].
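As a non-limiting sketch, the accumulation of an aggregative property from one scale to the next can be written as a weighted average using the interpolation matrix; the volume-weighted form below is an assumption consistent with the description above and in [7], not the exact recursion used in the invention.

```python
import numpy as np

def accumulate_property(P, q_fine, vol_fine):
    """Propagate a per-aggregate property one level up the pyramid.

    P        : (N_fine x N_coarse) interpolation matrix (soft memberships).
    q_fine   : property value of each fine aggregate (e.g., mean intensity).
    vol_fine : volume (accumulated voxel count) of each fine aggregate.
    Returns the coarse volumes and the volume-weighted average property."""
    vol_coarse = P.T @ vol_fine                        # accumulated volumes
    q_coarse = (P.T @ (vol_fine * q_fine)) / vol_coarse
    return vol_coarse, q_coarse

# Example: 4 fine aggregates merged (softly) into 2 coarse aggregates.
P = np.array([[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.3, 0.7]])
vol = np.array([10.0, 6.0, 12.0, 4.0])
intensity = np.array([0.9, 0.8, 0.2, 0.3])
print(accumulate_property(P, intensity, vol))
```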
Construction of the classifier based on these features requires consideration of the inter-subject and intra-subject variability; therefore all features were normalized for each brain.
Table 2 lists the features for aggregate k at scale s. The features were selected based on interaction with expert radiologists. However, the effect of each feature in classification is determined by an automatic learning process.
Once the MRI scan is segmented and features are computed, so that each aggregate is characterized by a high-dimensional feature vector f (see Table 2), we proceed to the classification stage. A classifier utilizing multiple decision trees [2] is trained using labeled data. Then, given an unlabeled scan, the classifier is used to detect the lesions. The classification is described below.
To construct the decision tree classifier, a learning process is applied using MRI scans with MS lesions delineated by experts. The process obtains two kinds of data: (1) a collection of M candidate segments, Cand={f1, . . . , fM}, each described by a d-dimensional feature vector (each feature is normalized to have zero mean and unit variance), and (2) a mask indicating the voxels marked as lesions by an expert. Since many of the candidate segments may contain a mixed collection of lesion and non-lesion voxels, a segment in which ≥70% of its voxels were marked by an expert as lesion is labeled as a lesion; this class is denoted by c1. Only those segments that do not contain lesion voxels at all are marked as non-lesions; this class is denoted by c2. The rest of the segments are ignored at the training stage.
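A small illustrative sketch of this labeling rule follows, assuming each candidate segment is given as a set of flat voxel indices and the expert delineation as a boolean volume; the names and thresholds are taken from the text or assumed for the example.

```python
import numpy as np

LESION, NON_LESION, IGNORE = 1, 0, -1

def label_segment(voxel_indices, lesion_mask, lesion_fraction=0.70):
    """Label a candidate segment for training:
       lesion (c1)     if >= 70% of its voxels are marked as lesion,
       non-lesion (c2) if it contains no lesion voxels at all,
       ignored         otherwise (mixed segments)."""
    marked = lesion_mask.flat[voxel_indices]
    frac = marked.mean()
    if frac >= lesion_fraction:
        return LESION
    if frac == 0.0:
        return NON_LESION
    return IGNORE

# Example: a 4x4x1 mask with a small lesion in one corner.
mask = np.zeros((4, 4, 1), dtype=bool)
mask[:2, :2, 0] = True
print(label_segment(np.array([0, 1, 4, 5]), mask))   # fully inside the lesion -> 1
```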
Next, the training data is used to construct multiple decision trees. A subset of the segments is randomly selected and used to construct a tree from the root downwards. At the root node all the labeled segments are considered, and they are repeatedly split into two subsets. At each tree node a Fisher Linear Discriminant (FLD) [4] is applied to the data, determining the optimal separation direction and threshold that lead to a maximal impurity decrease. This training procedure results in a forest of K decision trees T1, . . . , TK, each trained with a random selection of segments.
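For orientation only, the sketch below trains a forest of K trees, each on a random subset of labeled segments, and averages their per-tree lesion probabilities; scikit-learn's axis-aligned decision trees are used here as a stand-in for the FLD-split trees described in the text, so this is not the patented classifier.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_forest(features, labels, n_trees=100, subset_size=3000, seed=0):
    """Train K decision trees, each on a random selection of labeled segments.
    features: (M, d) normalized feature vectors; labels: (M,) in {0, 1}."""
    rng = np.random.default_rng(seed)
    forest = []
    for _ in range(n_trees):
        idx = rng.choice(len(labels), size=min(subset_size, len(labels)), replace=False)
        tree = DecisionTreeClassifier()          # stand-in for an FLD-split tree
        tree.fit(features[idx], labels[idx])
        forest.append(tree)
    return forest

def forest_lesion_probability(forest, f):
    """Average the per-tree lesion-class probabilities for a test segment f."""
    probs = []
    for t in forest:
        p = t.predict_proba(f.reshape(1, -1))[0]
        classes = list(t.classes_)
        probs.append(p[classes.index(1)] if 1 in classes else 0.0)
    return float(np.mean(probs))

# Tiny demo on synthetic labeled segments.
X = np.random.rand(500, 8)
y = (X[:, 0] > 0.5).astype(int)
forest = train_forest(X, y, n_trees=10, subset_size=200)
print(forest_lesion_probability(forest, X[0]))
```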
During the testing phase an unseen MRI scan is obtained. After segmentation and feature extraction, every segment f is classified by each of the K trees. Each tree Tq determines a probability measure PTq(cj|f) for each class cj, and these probabilities are averaged over the K trees.
Finally, a test segment is assigned with the label cj that maximizes this mean.
The classification process is applied to three segmentation scales, corresponding to small, intermediate, and large segments respectively. For each of these scales there is constructed a separate forest consisting of K=100 trees, trained with a random selection of Ns≦3000 patterns. The candidate segments for classification may overlap, so that a voxel may belong to more than one segment. To measure the total lesion load (TLL) it is necessary to generate a result in terms of voxels.
The classifier labels the candidate segments as lesion or non-lesion with some probability. All candidates are projected onto the data voxels using the interpolation matrix (Eq. (3)), which determines an association weight for each voxel and candidate. A voxel belongs to a candidate if the corresponding association weight is ≥0.5. The maximum probability over all candidates to which the voxel belongs determines the probability of the voxel to be a lesion. Further, both a hard and a soft classification of voxels are employed. In the hard classification a voxel is classified as a lesion if its probability to be a lesion is ≥0.5. However, since the ‘ground truth’ of the lesions may vary among different experts, it may be helpful to provide a soft classification of the candidates rather than just a binary result. To create the soft classification, each 2D slice is first normalized by the average intensity of the intra-cranial cavity (ICC) in that slice. Then, by selecting from the hard assignment only voxels with normalized values above a certain threshold (1.75 and 1.3 for multi-channel and FLAIR data, respectively), one can determine a specific soft assignment, which is denoted the automatic classification result.
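A hedged sketch of the voxel-level projection follows: each voxel takes the maximum lesion probability over the candidates to which it belongs (association weight ≥0.5), and the hard classification thresholds that probability at 0.5. The array names and shapes are assumptions for the example, and the ICC-based soft classification is omitted.

```python
import numpy as np

def voxel_lesion_probability(assoc, cand_prob, membership_thresh=0.5):
    """assoc:     (N_voxels, N_candidates) association weights obtained by
                  projecting the candidates onto the voxels.
       cand_prob: (N_candidates,) lesion probability of each candidate.
       A voxel belongs to a candidate if its association weight >= 0.5;
       its lesion probability is the maximum over those candidates."""
    belongs = assoc >= membership_thresh
    masked = np.where(belongs, cand_prob[None, :], 0.0)   # ignore non-member candidates
    return masked.max(axis=1)

def hard_classification(voxel_prob, thresh=0.5):
    """Hard voxel labels: lesion if the lesion probability is >= 0.5."""
    return voxel_prob >= thresh

# Example with 3 voxels and 2 candidates.
assoc = np.array([[0.9, 0.1], [0.4, 0.8], [0.2, 0.3]])
cand_prob = np.array([0.7, 0.4])
p = voxel_lesion_probability(assoc, cand_prob)
print(p, hard_classification(p))   # [0.7 0.4 0. ] [ True False False]
```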
Below are presented validation results of employing the novel integrated system on both simulated and real MR data.
Before applying classification, candidates are eliminated whose properties differ considerably from those expected of a lesion. Those include very non-salient regions (saliency >7), very large regions (volume >5000 voxels), regions located very close to the mid-sagittal plane (|x|<6), and very dark regions (intensity <0.75 and contrast to neighborhood <−0.25, where both are divided by the average ICC intensity). In addition we eliminate aggregates that overlap with anatomical structures where, as a rule, lesions do not develop. Those include the eyes and the cerebro-spinal fluid (CSF). To identify those structures we currently mark the corresponding segments manually; these structures can also be identified automatically by considering an atlas, as will be described further on in this document. We further use the automatic skull stripping utility (Brain Extraction Tool [13]) to identify the brain region and eliminate segments that extend beyond this region.
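These pre-classification filtering rules can be sketched as simple predicates on the aggregate features, as in the illustrative, non-limiting example below; the feature names in this dictionary-based sketch are assumptions, while the thresholds are those stated above.

```python
def keep_candidate(feat):
    """Reject candidates whose properties clearly differ from a lesion.
    `feat` is a dict of aggregate features; intensity and neighborhood
    contrast are assumed already divided by the average ICC intensity."""
    if feat["saliency"] > 7:                 # very non-salient region
        return False
    if feat["volume_voxels"] > 5000:         # very large region
        return False
    if abs(feat["x_position"]) < 6:          # too close to the mid-sagittal plane
        return False
    if feat["intensity"] < 0.75 and feat["neighbor_contrast"] < -0.25:
        return False                         # very dark region
    return True

print(keep_candidate({"saliency": 3, "volume_voxels": 400,
                      "x_position": 25, "intensity": 1.4,
                      "neighbor_contrast": 0.2}))   # True
```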
The segmentation complexity is linear in the number of voxels. The complexity for generating a tree classifier is O(d2Ns log(Ns)+d3Ns+dNs(log(Ns))2) and dominated by O(dNs(log(Ns))2), where Ns is the number of training patterns and d is the number of features. The testing complexity is O(d log(Ns)) per one test sample.
First are presented results of the novel integrated system on the Simulated Brain Database (SBD) provided by the McConnell Brain Imaging Center ([3]). Currently, the SBD contains three simulated models of brains (phantoms) with ‘mild’, ‘moderate’, and ‘severe’ levels of MS lesions. The invention was tested on the three MS phantoms, each including T1, T2 and PD images.
The multi-channel experiment was performed on the three channels for 30 slices, which contain 80% of the lesion load. The MS lesions present in these models are not symmetric between the left and right lobes. Training was performed on the right half of all three brain models and testing on the left half, where the midpoint was defined by the mid-sagittal plane. The detection rate measures the percentage of correct classifications of candidate segments in the test set (see definitions below). On the segment test set, the classification forests at all scales obtained detection rates of (1, 0.99, 0.99) for the lesion class (c1), the non-lesion class (c2), and the total candidate set, respectively.
Denote by (S) the set of voxels detected as lesions by the system and by (R) the set of voxels labeled as MS lesions in the ‘ground truth’ reference. nS and nR denote the number of connected components (lesions) in S and R, respectively.
Table 3 lists classification measures which are commonly used (e.g., [9], [14], [17]). These measures are presented in Table 4 and Table 6.
Table 4 shows results obtained after overlaying the candidates from all scales detected as MS by the forest classifiers.
To compare the results with other methods, the automatic classification of the detected area was applied using one specified threshold for all subjects. We obtained an average of κ=0.80±0.11 (mean±S.D.) on all three phantoms. In comparison, the authors in [17] tested their pipeline on the simulated data with varying levels of noise and INU; their best classification accuracy reported for the single condition with the same parameters used in our tests was 0.81.
To further evaluate our approach on clinical images, which reflect the full range of pathological variability, the novel algorithm was tested on real MR data [10]. This study consists of 16 subjects for which MS lesions were manually traced by a human expert. In this case, single-channel FLAIR images were used, which are known for their high sensitivity to lesions, offering a diagnostic capability beyond other sequences. The voxel size used is 0.97mm×0.97mm or 0.86mm×0.86mm (for 6 and 10 subjects, respectively), with slice thickness 5mm (24 slices). The data was divided as follows: set A includes examinations of 12 patients, and set B includes examinations of four additional patients who had a monthly follow-up, so that four time points were available for each patient.
Throughout the classification stage ten experiments were conducted. In each experiment, nine patients from set A were randomly selected for training. The test set consists of the remaining patients of set A and all patients of set B. In each one of the ten experiments three multiscale forests were generated.
Table 5 presents average detection rates for each scale over ten experiments.
Table 6 lists the average classification measures over the ten experiments for test sets A and B. We also assessed the significance of the correlation coefficient between the TLL volume detected by the expert and by the automatic segmentation for each set. The two upper rows in Table 6 demonstrate the results obtained for superior slices (above the eyeballs), where on average 0.88±0.05 of the lesion volume occurs. The results in the two lower rows were obtained on all slices; they are slightly lower due to the many artifacts found in the inferior slices of FLAIR data.
Comparing to results reported in the literature demonstrates the difficulty of the MS detection problem and reveals the high accuracy obtained by the invention. Corresponding results reported in [14] on multi-channel data were κ=0.45 and 0.51 for 5mm and 3mm slice thickness, respectively. In [17] the average was κ=0.6±0.07, whereas the κ similarity between pairs of 7 experts ranges from 0.51 to 0.67.
Over superior slices, our average was κ≥0.64. Results for all slices are comparable to the state of the art (κ≥0.6). The extra volume exhibited by the high FP measure may require further exploration. In our experiments, the main extra volume usually surrounds the lesion volume, and the DFP is significantly smaller than the FP. Preliminary assessment of our results indicates that this extra volume is somewhat related to other WM classes (e.g., ‘dirty-appearing’ white matter, DAWM [8]). Moreover, the delineation of lesion volume varies significantly between different experts, i.e., volume ratios reported in the literature may exceed 1.5 and even approach 3 ([14], [16], [17]). Therefore, it was concluded that the FP measure is in the range of the inter-rater variability.
Four sets of images that were acquired over four months (set B) were analyzed. Generally, tests for robustness and reproducibility should be performed on data rescanned repeatedly from the same brain. Here, since the interval between two scans was not short, the volume may also vary due to actual changes in patient pathology. Nevertheless, a serial analysis was performed, computing the ratio of the volume difference between our detection and the ground truth, divided by the ground-truth volume. The average results over time for the four subjects were (0.1±0.05, 0.06±0.06, 0.08±0.04, 0.39±0.11), respectively. For the last subject significantly worse results were obtained, probably due to the considerably smaller TLL relative to the other three subjects.
The present invention (i.e., the system or apparatus described in detail in this description of specific embodiments) can be implemented using a computer system as generally depicted in the drawings.
In another example, the system and method of the present invention, may be implemented using a high-level programming language (e.g., C++) and applications written for the Microsoft Windows NT or SUN OS environments. It will be apparent to persons skilled in the relevant art(s) how to implement the invention in alternative embodiments from the teachings herein.
The Computer system of the invention includes one or more processors and can execute software implementing the routines described above. Various software embodiments are described in terms of this exemplary computer system. After reading this description, it will become apparent to a person skilled in the relevant art how to implement the invention using other computer systems and/or computer architectures.
The Computer system can include a display interface that forwards graphics, text, and other data from the communication infrastructure (or from a frame buffer not shown) for display on the display unit included as part of the system.
The Computer system also includes a main memory, preferably random access memory (RAM), and can also include a secondary memory. The secondary memory can include, for example, a hard disk drive and/or a removable storage drive, representing a floppy disk drive, a magnetic tape drive, an optical disk drive, etc. The removable storage drive can read from and/or write to a removable storage unit in a well-known manner.
In alternative embodiments, a secondary memory may include other similar means for allowing computer programs or other instructions to be loaded into computer system. Such means can include, for example, a removable storage unit and an interface. Examples can include a program cartridge and cartridge interface (such as that found in video game console devices), a removable memory chip (such as an EPROM, or PROM) and associated socket, and other removable storage units and interfaces that allow software and data to be transferred from the removable storage unit to computer system.
The Computer system can also include a communications interface that allows software and data to be transferred between computer system and external devices via a communications path. Examples of communications interface can include a modem, a network interface (such as Ethernet card), a communications port, interfaces described above, etc. Software and data transferred via a communications interface are in the form of signals that can be electronic, electromagnetic, optical or other signals capable of being received by communications interface, via a communications path. Note that a communications interface provides a means by which computer system can interface to a network such as the Internet.
The present invention can be implemented using software running (that is, executing) in an environment similar to that described above.
Computer programs (also called computer control logic) are stored in main memory and/or secondary memory. Computer programs can also be received via a communications interface. Such computer programs, when executed, enable the computer system to perform the features of the present invention as discussed herein. In particular, the computer programs, when executed, enable the processor to perform features of the present invention. Accordingly, such computer programs represent controllers of the computer system.
The present invention can be implemented as control logic in software, firmware, hardware or any combination thereof. In an embodiment where the invention is implemented using software, the software may be stored in a computer program product and loaded into computer system using a removable storage drive, hard disk drive, or interface. Alternatively, the computer program product may be downloaded to computer system over a communications path. The control logic (software), when executed by the one or more processors, causes the processor(s) to perform functions of the invention as described herein.
Embodiments of the invention can be implemented as a program product for use with a computer system such as, for example, the cluster computing environment shown in the drawings.
In general, the routines executed to implement the embodiments of the present invention, whether implemented as part of an operating system or a specific application, component, program, module, object or sequence of instructions may be referred to herein as a “program.” The computer program typically is comprised of a multitude of instructions that will be translated by the native computer into a machine-readable format and hence executable instructions. Also, programs are comprised of variables and data structures that either reside locally to the program or are found in memory or on storage devices. In addition, various programs described herein may be identified based upon the application for which they are implemented in a specific embodiment of the invention. However, it should be appreciated that any particular program nomenclature that follows is used merely for convenience, and thus the invention should not be limited to use solely in any specific application identified and/or implied by such nomenclature.
It is also clear that, given the typically endless number of manners in which computer programs may be organized into routines, procedures, methods, modules, objects, and the like, as well as the various manners in which program functionality may be allocated among various software layers that are resident within a typical computer (e.g., operating systems, libraries, APIs, applications, applets, etc.), the invention is not limited to the specific organization and allocation of program functionality described herein.
The present invention can be realized in hardware, software, or a combination of hardware and software. A system according to a preferred embodiment of the present invention can be realized in a centralized fashion in one computer system, or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system—or other apparatus adapted for carrying out the methods described herein—is suited. A typical combination of hardware and software could be a general purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein.
Each computer system may include, inter alia, one or more computers and at least a signal bearing medium allowing a computer to read data, instructions, messages or message packets, and other signal bearing information from the signal bearing medium. The signal bearing medium may include non-volatile memory, such as ROM, Flash memory, Disk drive memory, CD-ROM, and other permanent storage. Additionally, a computer medium may include, for example, volatile storage such as RAM, buffers, cache memory, and network circuits. Furthermore, the signal bearing medium may comprise signal bearing information in a transitory state medium such as a network link and/or a network interface, including a wired network or a wireless network, that allow a computer to read such signal bearing information.
In another embodiment, the invention is implemented primarily in firmware and/or hardware using, for example, hardware components such as application specific integrated circuits (ASICs). Implementation of a hardware state machine so as to perform the functions described herein will be apparent to persons skilled in the relevant art(s) from the teachings herein.
Presented is a novel multiscale method and apparatus that combines segmentation with classification for detecting abnormal 3D brain structures. The focus was on analyzing 3D MRI brain data containing brain scans of multiple sclerosis patients. The method is based on a combination of a powerful multiscale segmentation algorithm, a rich feature vocabulary describing the segments, and a decision tree-based classification of the segments. By combining segmentation and classification it was possible to utilize integrative, regional properties that provide regional statistics of segments, characterize their overall shapes, and localize their boundaries.
The multiscale segmentation algorithm was adapted to handle 3D multi-channel MRI scans and anisotropic voxel resolutions. The rich set of features employed was selected in consultation with expert radiologists. All the features are computed as part of the segmentation process, and they are used in turn to further affect the segmentation process. The classification step examines each aggregate and labels it as either lesion or non-lesion. This classification is integrated across scale to determine the voxel occupancy of the lesions. We have demonstrated the utility of our method through experiments on simulated and real MRI data, including several modalities (T1, T2, PD and FLAIR). Comparison of the results to other automated segmentation methods applied to Multiple Sclerosis shows the high accuracy rates obtained by the system.
The approach is flexible with no restrictions on the MRI scan protocol, resolution, or orientation. Unlike common approaches the novel method does not require a full brain tissue classification into white matter (WM), gray matter (GM), and cerebro-spinal fluid (CSF), and it is not limited to finding the lesions in the WM only, risking the omission of sub-cortical lesions. Furthermore, the novel learning process requires only a few training examples, as shown specifically in the experiments.
We believe that the inventive method and apparatus can further be improved by better exploiting the rich information produced by the segmentation procedure. Other features that can characterize lesions, as well as features that can characterize dirty appearing white matter (DAWM) can be incorporated. Also of importance is to incorporate prior knowledge of anatomic structures into the framework using a brain atlas. Finally, the novel method and apparatus can be applied to other tasks and modalities in medical imaging.
There follows now a description of an atlas-guided identification of brain structures by combining 3D segmentation and SVM classification.
The following presents a novel automatic approach (method and apparatus) for the identification of anatomical brain structures in magnetic resonance images (MRI). The method combines a fast multiscale multi-channel three dimensional (3D) segmentation algorithm providing a rich feature vocabulary together with a support vector machine (SVM) based classifier. The segmentation produces a full hierarchy of segments, expressed by an irregular pyramid with only linear time complexity. The pyramid provides a rich, adaptive representation of the image, enabling detection of various anatomical structures at different scales. A key aspect of the invention is the thorough set of multiscale measures employed throughout the segmentation process which are also provided at its end for clinical analysis. These features include in particular the prior probability knowledge of anatomic structures due to the use of an MRI probabilistic atlas. An SVM classifier is trained based on this set of features to identify the brain structures. The invention was validated using a gold standard real brain MRI data set. Comparison of the results with existing algorithms displays the promise of the invention.
Accurate classification of magnetic resonance images (MRI) is essential in many neuroimaging applications. Quantitative analysis of anatomical brain tissue, such as, white matter (WM), gray matter (GM) and cerebrospinal fluid (CSF) is important for clinical diagnosis, therapy of neurological diseases and for visualization and analysis ([18], [19], [20]). However, automatic segmentation of MRI is difficult due to artifacts such as partial volume effect (PVE), intensity non-uniformity (INU) and motion. The INU artifact also referred to as inhomogeneity or shading artifact causes spatial inter-scan variation in the pixel intensity distribution over the same tissue classes. It depends on several factors but is predominantly caused by the scanner magnetic field. Numerous approaches have been developed for Brain MRI segmentation (see surveys [18], [21], [22], [23]). Low-level classification algorithms (which are compared in the following) typically involve common unsupervised algorithms including k-means and fuzzy c-means ([19],[21]). Yet, low level techniques that exploit only local information for each voxel and do not incorporate global shape and boundary constraints are limited in dealing with the difficulties in fully automatic segmentation of brain MRI. Deformable models have also been proposed in [24] to find coupled surfaces representing the interior and exterior boundary of the cerebral cortex. These techniques benefit from consideration of global prior knowledge about expected shape yet their limitations come from dependence on initialization and from the computation time required for 3D segmentation.
Statistical approaches, which classify voxels according to probabilities based on the intensity distribution of the data, have been widely used [25]. These approaches typically model the intensity distribution of brain tissues by a Gaussian mixture model. Given the distribution, the optimal segmentation can be estimated by a maximum a posteriori (MAP) or a maximum likelihood (ML) formulation ([19],[20]). The expectation maximization (EM) algorithm is popular for solving the estimation problem. It has been applied to simultaneously perform brain segmentation and estimate the INU correction [26]. The EM framework has been extended to account for spatial considerations by including a Markov Random Field [27] and by utilizing a brain atlas [28]. The following introduces a fully automatic method to identify brain structures in MRI, utilizing the 3D segmentation framework presented in [29], which extends the algorithm presented in ([30],[31]) to handle 3D multi-channel anisotropic MRI data. The inventive work combines the fast multiscale segmentation algorithm with a support vector machine (SVM) classifier based on a novel set of features. Prior knowledge of anatomic structures is incorporated using an MRI brain atlas. In addition to these priors, a set of regional features is computed for each aggregate, which includes intensity, texture, and shape features, accumulated during the aggregation process. Unlike existing studies, the invention does not involve explicit correction of magnetic field inhomogeneities. The invention is validated by applying the inventive method to a standard database with a varying bias field and comparing the results to existing algorithms.
The following description is organized as follows: a description of the segmentation, feature extraction and classification process. Comparative experimental results for automatic detection of the major brain anatomical tissues are shown below.
The method begins with utilizing the segmentation algorithm presented in [29]. This algorithm has extended the 2D segmentation algorithm developed for natural images ([30], [31]) to apply it to 3D multi-channel anisotropic MRI data. The segmentation scheme is described briefly below (for more details see [29], [30], [31]), incorporated by reference herein. The method, which is derived from algebraic multigrid (AMG) [32], starts by assembling together adjacent voxels into small aggregates based on intensity similarity, each voxel being allowed to belong to several aggregates with different association weights. These aggregates are then similarly assembled into larger aggregates, then still larger aggregates, etc. The affiliations between aggregates are based on tunable statistical measures, which are called features. Features obtained for small aggregates at a fine level affect the aggregation formation of larger aggregates at coarser levels, according to feature resemblance. In this multiscale Segmentation by Weighted Aggregation (SWA), a pyramid (hierarchy) of aggregates is constructed from fine (bottom) to coarse (top), such that each aggregate at a finer level may be associated to several larger aggregates at a subsequent coarser scale, with different weights. This soft weighted aggregation allows the algorithm to avoid pre-mature local decisions and to detect segments based on a global saliency measure. The algorithm is very fast, since only its initial stage operates at the level of individual voxels. It collects statistics of filter responses, identifies regions of coherent textures, quantifies the shape of segments, their boundaries and their density variability at all scales, etc., allowing the emergence of image segments according to any desired aggregation and saliency criteria. Moreover, each segment emerges from the aggregation process equipped with a list of accumulated numerical descriptors, its features, which can then be used for classification and diagnosis processes.
A major aspect of MRI is the wide variety of pulse sequences (modalities) available for producing different images. Each modality gives rise to a different image that may highlight different type of tissues. In the present inventive work, segmentation was applied to a single T1 channel. However, applying segmentation (and likewise classification) simultaneously to images obtained by several channels can lead to superior results that usually cannot be achieved by considering just one channel. Another important aspect is the anisotropic nature of most clinical MRI data (with lower vertical resolution) which if not taken into account may lead to inaccurate analysis of the data. Relying on the flexibility of the segmentation framework, the inventive method applied a 3D multi-channel segmentation algorithm that can process several modalities simultaneously, and handle both isotropic data as well as anisotropic data.
The segmentation process computes statistical aggregative features throughout the pyramid construction. These features, which affect the formation of aggregates, are also available for the classification of anatomical structures at the end of the process. The development of the set of features is guided by interaction with expert radiologists, and the quantitative effects of the various features are determined by the automatic learning process described below. It can be shown that these properties can be calculated recursively (see [29], [31] for notations). In this work, the set of features was expanded to include information about the expected location of the major tissue types. The prior probability knowledge of anatomic structures was incorporated using an MRI probabilistic atlas. Employed were the International Consortium for brain mapping (ICBM) probability maps which represent the likelihood of finding GM, WM and CSF at a specified position for a subject that has been aligned to the atlas space. The ICBM452 atlas and brain data sets are brought into spatial correspondence using the Statistical Parametric Mapping (SPM) registration software [33] so that for every aggregate we can compute its average probability to belong to any of the three tissue categories. Construction of the classifier based on these features requires consideration of the inter-subject and intra-subject variability; therefore all features were normalized for each brain. Table 7 presents a partial list of the features for aggregate k at scale s used in the inventive work.
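By way of a non-limiting sketch, the atlas-prior feature can be computed as the (association-weighted) average of the registered GM/WM/CSF probability maps over the voxels of an aggregate; the names, the toy maps, and the soft-weighting choice below are assumptions for illustration only.

```python
import numpy as np

def aggregate_atlas_priors(prob_maps, voxel_indices, weights=None):
    """prob_maps:     dict of tissue -> 3D probability map, already registered
                      to the scan's coordinate system.
       voxel_indices: flat indices of the voxels associated with the aggregate.
       weights:       optional association weights of those voxels (soft membership).
       Returns the average probability of the aggregate for each tissue class."""
    if weights is None:
        weights = np.ones(len(voxel_indices))
    weights = weights / weights.sum()
    return {tissue: float(np.dot(weights, pm.flat[voxel_indices]))
            for tissue, pm in prob_maps.items()}

# Example with toy 2x2x1 probability maps (stand-ins for the ICBM maps).
maps = {"GM": np.array([[[0.8], [0.2]], [[0.1], [0.0]]]),
        "WM": np.array([[[0.1], [0.7]], [[0.8], [0.9]]]),
        "CSF": np.array([[[0.1], [0.1]], [[0.1], [0.1]]])}
print(aggregate_atlas_priors(maps, np.array([1, 2, 3])))
```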
Support Vector Machine (SVM) Classification
A candidate set of segments was extracted from the intermediate levels of the pyramid (scales 5 and 6 of all 13 scales), which correspond to brain tissue regions. To construct the classifier, the “ground-truth” expert segmentation provided along with the real clinical brain MRI data was utilized. The training data consists of (1) a training sample of M candidate segments, Cand={f1 . . . fM}, each described by a d-dimensional feature vector (each feature is normalized to have zero mean and unit variance), and (2) a mask indicating the voxels marked by an expert as GM, WM, CSF or background (BG). Since many of the candidate segments may contain a mixed collection of the categories, the labeling category is determined as the category marked for the maximum number of voxels associated with the segment.
Table 8 lists the mean (±S.D.) classification measures obtained on all 20 subjects for the four different classes.
Twenty normal MR brain data sets and their manual segmentations were provided by the Center for Morphometric Analysis at Massachusetts General Hospital and are available at “http://www.cma.mgh.harvard.edu/ibsr/”. The sets are sorted based on their difficulty level. The sets were separated into “odd” and “even” indexes. The classifier was trained on the odd sets and tested on the even sets and vice versa, so that both the training and the testing consist of ten sets covering the entire range of difficulty. The training data was used to construct an SVM-based classifier. The results presented here were obtained by using a radial basis function (RBF) kernel, K(x, y) = e^(−γ‖x−y‖²). A detailed description of SVMs can be found in [34].
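A hedged sketch of such SVM training and of the odd/even cross-testing follows, using scikit-learn's SVC with an RBF kernel; the hyperparameter values and the synthetic data are placeholders, not the values or data used in the invention.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def train_tissue_classifier(features, labels, gamma="scale", C=1.0):
    """Train an RBF-kernel SVM on normalized segment features.
    features: (M, d) feature vectors; labels coded as ints for GM/WM/CSF/BG."""
    clf = make_pipeline(StandardScaler(),            # zero mean, unit variance per feature
                        SVC(kernel="rbf", gamma=gamma, C=C))
    clf.fit(features, labels)
    return clf

# Odd/even split illustration on synthetic data (stand-in for the IBSR sets).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))
y = rng.integers(0, 4, size=200)
clf = train_tissue_classifier(X[::2], y[::2])        # train on the "odd" sets
print(clf.score(X[1::2], y[1::2]))                   # test on the "even" sets
```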
Following the learning phase, in the testing phase an unseen MRI scan is obtained. After segmentation and feature extraction we apply the SVM classifier to every candidate segment in the test set and finally assign a category label to each candidate. All candidate segments are projected onto the data voxels using the segmentation interpolation matrix (see details in [12]). The maximum association weight of the voxel determines the segment to which the voxel belongs, which leads to an assignment of a class label to each voxel.
The integrated approach was tested on 20 coronal T1-weighted real MRI data sets of normal subjects with GM, WM and CSF expert segmentations provided by the Internet Brain Segmentation Repository (IBSR), after they have been positionally normalized. The brain scans used to generate these results were chosen because they have been used in published volumetric studies in the past and because they have various levels of difficulty. This allows the assessment of the method's performance under varying conditions of signal-to-noise ratio, INU, PVE, shape complexity, etc. The inventive method was tested using 45 central coronal slices, which contain 0.94±0.02 of the brain voxels, including the cerebellum and brain stem.
The results presented were obtained by overlaying the candidate segments of the tested brain set according to their labeling category by the SVM classifier. The validation scores presented are based on the common measures for spatial overlap (e.g., [23], [36]). Denote by (S) the set of voxels automatically detected as a specific class and by (R) the set of voxels labeled as the same class in the ‘ground truth’ reference. The classification measures used in Tables 8 and 9 are defined in terms of the overlap between (S) and (R).
Table 9 presents the comparative classification results.
MRI is considered the ideal method for brain imaging. The 3D data and the large number of possible protocols enable identification of anatomical structures as well as abnormal brain structures. In this work we focus on segmenting normal brains into three tissues: WM, GM and CSF. The segmentation pyramid provides a rich, adaptive representation of the image, enabling detection of various anatomical structures at different scales. A key aspect of the invention is the comprehensive set of multiscale measurements applied throughout the segmentation process. These quantitative measures, which take into account the atlas information, can further be used for clinical investigation. For classification we apply an automatic learning procedure based on an SVM algorithm using data pre-labeled by experts. Our approach is unique in that it combines a rich and tunable set of features, emerging from statistical measurements at all scales. Our competitive results, obtained using a standard SVM classifier, demonstrate the high potential of such features.
The algorithm was evaluated on real 3D MRI brain scans, demonstrating its ability to detect anatomical brain structures. It imposes no restrictions on the MRI scan protocols and can further be generalized to other tasks and modalities, such as detecting evolving tumors or other anatomical substructures. Continued work on the invention is anticipated, expanding the method and apparatus to the detection of further internal brain anatomical structures. Extraction of other classes of structures may require the use of additional features, and perhaps the more detailed knowledge domain available in brain atlases will be of assistance.
Although the present invention has been described in terms of specific preferred embodiments, nevertheless modifications and changes will become apparent to those of skill in the art from the teachings herein. Such modifications and changes as will be readily apparent to those skilled in the art are deemed to fall within the purview of the invention as set forth in the claims appended hereto.
Filing Document | Filing Date | Country | Kind | 371(c) Date
PCT/US06/49536 | 12/28/2006 | WO | 00 | 6/17/2010

Number | Date | Country
60755393 | Dec 2005 | US