The present invention relates to medical imaging of the brain, and more particularly, to automatic segmentation of sub-cortical and cortical brain structures in 3D magnetic resonance images.
Detection and delineation of sub-cortical and cortical brain structures in magnetic resonance (MR) data is an important problem in medical imaging analysis. For example, delineated sub-cortical and cortical brain structures can be used in detecting abnormal brain patterns, studying various brain diseases, and studying brain growth. Manual delineation of such sub-cortical and cortical brain structures in 3D MR data is a challenging task, even for expert users, such as clinicians or neuroanatomists. Task-specific protocols for manual brain structure annotation exist, but the correct utilization of such protocols depends heavily on user experience. Moreover, even for experienced users, the manual delineation of a single brain structure is a time-consuming process, and manual annotations may vary significantly among experts as a result of individual experience and interpretation. Accordingly, a method for automatically segmenting sub-cortical and cortical brain structures in MR volumes is desirable.
Conventional techniques for segmenting sub-cortical and cortical brain structures cannot be used for purposes such as detecting abnormal brain patterns, studying brain diseases, and studying brain growth due to a lack of proper generalization. In particular, conventional methods may successfully perform segmentation of sub-cortical and cortical brain structures in MR data for healthy adult brains, but cannot reliably detect abnormal brain structures or brain structures in pediatric brains. The generalization of such conventional methods for the detection of abnormal brain structures is not trivial, and it has been necessary to develop specific methods that only work for specific cases. Accordingly, a method for sub-cortical and cortical brain structure segmentation that is robust enough to reliably segment abnormal brain structures and pediatric brain structures, as well as healthy adult brain structures, is desirable.
The present invention provides a method and system for segmenting sub-cortical and cortical brain structures in 3D MR images. For example, embodiments of the present invention provide automatic segmentation of the brain structures of the left and right caudate nucleus, hippocampus, globus pallidus, putamen, and amygdala.
In one embodiment, a meta-structure including center positions of multiple brain structures is detected in a 3D MR image. At least one of the brain structures is then individually segmented using marginal space learning (MSL) constrained by the detected meta-structure. The MSL framework for individually detecting the brain structures can be extended to include shape inference in addition to position detection, position-orientation detection, and full similarity transformation detection.
These and other advantages of the invention will be apparent to those of ordinary skill in the art by reference to the following detailed description and the accompanying drawings.
The present invention is directed to a method and system for automatic segmentation of sub-cortical and cortical brain structures in 3D magnetic resonance (MR) images. Embodiments of the present invention are described herein to give a visual understanding of the sub-cortical and cortical brain structure segmentation method. A digital image is often composed of digital representations of one or more objects (or shapes). The digital representation of an object is often described herein in terms of identifying and manipulating the objects. Such manipulations are virtual manipulations accomplished in the memory or other circuitry/hardware of a computer system. Accordingly, it is to be understood that embodiments of the present invention may be performed within a computer system using data stored within the computer system.
Embodiments of the present invention are directed to automated (sub)-cortical brain structure segmentation in 3D MR images. As used herein, a “(sub)-cortical brain structure” refers to any sub-cortical and/or cortical brain structure. For example, embodiments of the present invention segment a set of (sub)-cortical brain structures including the left and right caudate nucleus, hippocampus, globus pallidus, putamen, and amygdala. Embodiments of the present invention utilize a top-down segmentation approach based on Marginal Space Learning (MSL) to detect such (sub)-cortical brain structures. MSL decomposes the parameter space of each respective anatomic structure along decreasing levels of geometrical abstraction into subspaces of increasing dimensionality by exploiting parameter invariance. At each level of abstraction, i.e., in each subspace, strong discriminative models are trained from annotated training data, and these models are used to narrow the range of possible solutions until a final shape of the brain structure can be inferred. The basic MSL framework is described in greater detail in Zheng et al., “Four-Chamber Heart Modeling and Automatic Segmentation for 3D Cardiac CT Volumes Using Marginal Space Learning and Steerable Features”, IEEE T. Med. Imag. 27(11) (November 2008), pgs. 1668-1681, which is incorporated herein by reference. Contextual shape information for the brain structures is introduced by representing candidate shape parameters with high-dimensional vectors of 3D generalized Haar features and steerable features derived from observed volume intensities in an MR image.
For combined 3D rigid anatomy detection and shape inference, an extended MSL-based framework is used. A structure of interest's center is estimated as $c=(c_1, c_2, c_3) \in \mathbb{R}^3$, orientation as $R \in SO(3)$, scale as $s=(s_1, s_2, s_3) \in \{s \in \mathbb{R}^3 \mid s_i > 0,\ i=1,2,3\}$, and shape as $x=(x_1, y_1, z_1, \ldots, x_n, y_n, z_n)^T \in \mathbb{R}^{3n}$. The shape parameter comprises canonically sampled 3D points $x_i=(x_i, y_i, z_i)^T$, $i \in \{1, \ldots, n\}$, on the surface of the object to be segmented. Note that $R$ is relative to $c$, $s$ is relative to $c$ and $R$, and $x$ is relative to $c$, $R$, and $s$. Let $V=\{1, 2, \ldots, N\}$, $N \in \mathbb{N}$, be a set of indices of image voxels, $Y=(y_\nu)_{\nu \in V}$, $y_\nu \in \{-1, 1\}$, a binary segmentation of the image voxels into object and non-object voxels, and $f$ be a function with $Y=f(I,\Theta)$ that provides a binary segmentation of volume $I$ using segmentation parameters $\Theta=(c,R,s,x)$. Let $Z=(z_\Theta)$ be a family of high-dimensional feature vectors extracted from a given input volume $I=(i_\nu)_{\nu \in V}$ and associated with different discretized configurations of $\Theta$. In embodiments of the present invention, $Z$ can include voxel-wise context encoding 3D generalized Haar-like features to characterize possible object centers and steerable features that are capable of representing hypothetical orientations and optionally scaling relative to a given object center or shape surface point.
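The relative parameterization above can be illustrated with a minimal NumPy sketch that maps object-relative surface points into volume coordinates. The composition order (scale, then rotate, then translate) and the helper names `rotation_z` and `shape_to_volume` are assumptions made for illustration; the text only states that $R$ is relative to $c$, $s$ to $c$ and $R$, and $x$ to $c$, $R$, and $s$.

```python
import numpy as np

def rotation_z(angle_rad):
    """Rotation about the z axis; a convenient element of SO(3) for the demo."""
    ca, sa = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[ca, -sa, 0.0],
                     [sa,  ca, 0.0],
                     [0.0, 0.0, 1.0]])

def shape_to_volume(c, R, s, x):
    """Map a flattened shape vector x (length 3n) to (n, 3) volume coordinates.

    Assumed composition: volume_point = c + R @ (s * x_i).
    """
    c = np.asarray(c, dtype=float)
    R = np.asarray(R, dtype=float)
    s = np.asarray(s, dtype=float)
    pts = np.asarray(x, dtype=float).reshape(-1, 3)
    # Scale along the object axes, rotate into the volume frame, then translate.
    return (pts * s) @ R.T + c

# Two surface points given relative to (c, R, s).
points = shape_to_volume(c=(10.0, 0.0, 0.0),
                         R=rotation_z(np.pi / 2),
                         s=(2.0, 1.0, 1.0),
                         x=[1.0, 0.0, 0.0, 0.0, 1.0, 0.0])
```

Because the shape points are stored relative to the pose, the same shape vector can be reused unchanged while the center, orientation, and scale hypotheses vary.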
In order to detect an individual brain structure, we search for an optimal parameter vector:
$$\Theta^* = \arg\max_{\Theta}\; p\left(y=1 \mid Z, M^{(\Theta)}\right) \tag{1}$$
maximizing the posterior probability of the presence, i.e., $y=1$, of the target brain structure given the discriminative model $M^{(\Theta)}$ and the features $Z$ extracted from the input volume $I$ using a certain set of values for the parameters $\Theta$.
Let $\pi^{(c)}(Z)$, $\pi^{(c,R)}(Z)$, $\pi^{(c,R,s)}(Z)$, and $\pi^{(c,R,s,x)}(Z)$ denote the vectors of components of $Z$ associated with individual groups of elements $(c)$, $(c,R)$, $(c,R,s)$, and $(c,R,s,x)$ of the parameter vector $\Theta$. The MSL method avoids exhaustively searching the high-dimensional parameter space spanned by all possible $\Theta$ by exploiting the fact that, ideally, for any discriminative model for center detection with parameters $M^{(c)}$ working on a restricted set of possible features,
$$c^* = \arg\max_{c}\; p\left(y=1 \mid \pi^{(c)}(Z), M^{(c)}\right) \tag{2}$$
holds, as the object center $c$ is invariant under relative reorientation, relative rescaling, and relative shape positioning. Similarly, we have
$$R^* = \arg\max_{R}\; p\left(y=1 \mid \pi^{(c^*,R)}(Z), M^{(c,R)}\right) \tag{3}$$
for combined position-orientation detection with model parameters $M^{(c,R)}$, where only features $\pi^{(c^*,R)}(Z)$ with $c=c^*$ are considered. This is due to the fact that position and orientation are invariant under relative rescaling and relative shape positioning. Analogous considerations yield
$$s^* = \arg\max_{s}\; p\left(y=1 \mid \pi^{(c^*,R^*,s)}(Z), M^{(c,R,s)}\right) \tag{4}$$
for the target object's scaling, and
$$x^* = \arg\max_{x}\; p\left(y=1 \mid \pi^{(c^*,R^*,s^*,x)}(Z), M^{(c,R,s,x_i)}, M^{(c,R,s,x)}\right) \tag{5}$$
for the target object's shape, where $M^{(c,R,s,x_i)}$ are the parameters of a local shape model with respect to individual surface points $x_i$, and parameters $M^{(c,R,s,x)}$ represent a global shape model. Equations (2)-(5) set up a chain of discriminative models exploiting search space parameter invariance for combined 3D shape detection and shape inference. This allows different discriminative models to be applied in descending order of geometrical abstraction as, in embodiments of the present invention, the object center $c$ alone is the most geometrically abstract and the complete set of parameters $\Theta$ is the least abstract shape representation. Therefore, MSL establishes a hierarchical decomposition of the search space along decreasing levels of geometrical abstraction with increasing dimensionality of the considered parameter subspace.
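The staged search through subspaces of increasing dimensionality can be sketched as follows. This is a toy illustration only: the scoring functions `p_c`, `p_cR`, and `p_cRs` are hypothetical stand-ins for trained discriminative models, orientation is reduced to a single angle in degrees, and keeping the top `k` candidates at each stage is an assumed pruning strategy.

```python
import numpy as np

def top_k(candidates, score_fn, k):
    """Keep the k highest-scoring candidates."""
    return sorted(candidates, key=score_fn, reverse=True)[:k]

def marginal_space_search(centers, rotations, scales, p_c, p_cR, p_cRs, k=3):
    """Search position, then position-orientation, then full similarity transform."""
    # Stage 1: score position hypotheses only.
    best_c = top_k(centers, p_c, k)
    # Stage 2: extend surviving positions with every orientation hypothesis.
    hyp_cR = [(c, R) for c in best_c for R in rotations]
    best_cR = top_k(hyp_cR, lambda h: p_cR(*h), k)
    # Stage 3: extend surviving position-orientation pairs with every scale.
    hyp_cRs = [(c, R, s) for (c, R) in best_cR for s in scales]
    return max(hyp_cRs, key=lambda h: p_cRs(*h))

# Toy "classifiers" peaked at a known pose (hypothetical, for demonstration).
true_c, true_R, true_s = (2, 1, 0), 30, 1.5

def p_c(c):
    return float(np.exp(-np.sum((np.array(c) - np.array(true_c)) ** 2)))

def p_cR(c, R):
    return p_c(c) * float(np.exp(-((R - true_R) / 30.0) ** 2))

def p_cRs(c, R, s):
    return p_cR(c, R) * float(np.exp(-(s - true_s) ** 2))

centers = [(i, j, 0) for i in range(4) for j in range(4)]
rotations = [0, 30, 60, 90]
scales = [1.0, 1.5, 2.0]
result = marginal_space_search(centers, rotations, scales, p_c, p_cR, p_cRs)
```

In this toy setting the staged search scores 16 + 12 + 9 = 37 hypotheses, whereas an exhaustive search over the joint space would score 16 × 4 × 3 = 192.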
Let $\mathcal{Z}$ be the set of annotated image volumes in their transformed feature representation as described above. $\mathcal{Z}$ is referred to herein as the training data. In order to detect the nine-parameter similarity transformation of the optimal parameter $\Theta^*$, i.e., $c^*$, $R^*$, and $s^*$, discriminative models $p(y=1 \mid \pi^{(c^*)}(Z))$, $p(y=1 \mid \pi^{(c^*,R)}(Z))$, and $p(y=1 \mid \pi^{(c^*,R^*,s)}(Z))$ are learned (trained) based on the training data. Following the concept of MSL, a set of positive and negative training examples $C=\{(\pi^{(c)}(Z), y) \mid Z \in \mathcal{Z}\}$ is generated from the training data to train a probabilistic boosting tree (PBT) classifier for position detection. The feature vectors $\pi^{(c)}(Z)$ can be 3D generalized Haar-like features encoding voxel context of candidate object centers based on observed intensity values. Decreasing the level of geometric abstraction, a PBT classifier is analogously trained for combined position-orientation detection based on an extended set of training examples $G=\{(\pi^{(c,R)}(Z), y) \mid Z \in \mathcal{Z}\}$, where the feature vectors $\pi^{(c,R)}(Z)$, associated with $(c,R)$ and an image volume, are steerable features. Steerable features allow varying orientation and scaling to be encoded in terms of aligned and scaled intensity sampling patterns. According to an advantageous implementation, steerable features are also used to train a PBT classifier for full nine-parameter similarity transformation detection based on an extended set of training examples $S=\{(\pi^{(c,R,s)}(Z), y) \mid Z \in \mathcal{Z}\}$, where $\pi^{(c,R,s)}(Z)$ is derived from $(c, R, s)$ and the associated image volume.
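The idea of an aligned and scaled intensity sampling pattern can be sketched as below. This is a simplified illustration under stated assumptions: only raw intensities are sampled, nearest-neighbor lookup is used, out-of-volume samples are clamped to the border, and the function name `steerable_samples` and the cubic pattern size are hypothetical choices rather than details taken from the description.

```python
import numpy as np

def steerable_samples(volume, c, R, s, half=1):
    """Sample intensities on a (2*half+1)^3 pattern centered at c,
    aligned with orientation R and stretched by scale s."""
    ticks = np.arange(-half, half + 1, dtype=float)
    # Regular pattern in the object's own coordinate frame.
    grid = np.stack(np.meshgrid(ticks, ticks, ticks, indexing="ij"),
                    axis=-1).reshape(-1, 3)
    # Scale, rotate, and translate the pattern into volume coordinates.
    pts = (grid * np.asarray(s, dtype=float)) @ np.asarray(R, dtype=float).T
    pts = pts + np.asarray(c, dtype=float)
    # Nearest-neighbor lookup, clamped to the volume bounds.
    idx = np.clip(np.rint(pts).astype(int), 0, np.array(volume.shape) - 1)
    return volume[idx[:, 0], idx[:, 1], idx[:, 2]]

volume = np.arange(125, dtype=float).reshape(5, 5, 5)
feats = steerable_samples(volume, c=(2, 2, 2), R=np.eye(3), s=(1, 1, 1))
feats_scaled = steerable_samples(volume, c=(2, 2, 2), R=np.eye(3), s=(2, 2, 2))
```

The same pattern is reused for every hypothesis; only the transform applied to it changes, so no volume resampling or rotation is needed per hypothesis.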
In order to detect the final object shape for each individual brain structure, we further decompose
π(c,R,s,x)(Z)=(π(c,R,s,x
where π(c,R,s,x
in an iterative manner. The term p(yi=1|π(c,R,s,x
The global shape model can be implemented as an active shape model (ASM), which can be used to incorporate prior shape knowledge during segmentation. Active shape models are described in detail in Cootes et al. “Active Shape Models-Their Training and Application” Comp. Vis. Image Understand. 61(1) (January 1995), pgs. 38-59, which is incorporated herein by reference. In an ASM, the shape of a target structure is represented as a cloud of points, which are either manually or automatically placed at certain characteristic locations within the class of images to be processed. Once these sets of labeled point features, or landmarks, are established for each image, they are linearly aligned to each other in order to remove translation, rotation, and scaling as far as possible. This can be done using the generalized Procrustes analysis (GPA), which is well known and described in detail in Gower “Generalized Procrustes Analysis” Psychometrika 40(1) (March 1975), pgs. 33-50, which is incorporated herein by reference. After the GPA, all the shapes are transformed to a common coordinate system, the model space of the ASM. The remaining variability can be described as a prior model using a Point Distribution Model (PDM).
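The alignment and point distribution modeling steps can be sketched in NumPy as follows. This is a minimal illustration, not the referenced algorithms in full: scale is removed by unit-norm normalization, the rotation step uses an orthogonal Procrustes solution that also admits reflections, and the PDM is taken to be a PCA of the aligned landmark vectors. All function names are hypothetical.

```python
import numpy as np

def normalize(shape):
    """Remove translation and scale: center the points and set unit norm."""
    shape = shape - shape.mean(axis=0)
    return shape / np.linalg.norm(shape)

def align(shape, ref):
    """Rotate a normalized (n, 3) shape onto ref (orthogonal Procrustes)."""
    u, _, vt = np.linalg.svd(shape.T @ ref)
    return shape @ (u @ vt)

def generalized_procrustes(shapes, iters=10):
    """Iteratively align all shapes to their evolving mean."""
    shapes = [normalize(np.asarray(s, dtype=float)) for s in shapes]
    mean = shapes[0]
    for _ in range(iters):
        shapes = [align(s, mean) for s in shapes]
        mean = normalize(np.mean(shapes, axis=0))
    return np.array(shapes), mean

def point_distribution_model(aligned):
    """PCA of aligned shapes: mean vector, variation modes, mode variances."""
    flat = aligned.reshape(len(aligned), -1)
    mean = flat.mean(axis=0)
    _, sing, modes = np.linalg.svd(flat - mean, full_matrices=False)
    return mean, modes, sing ** 2 / (len(flat) - 1)

# Demo: one landmark set under random rotation, scaling, and translation.
rng = np.random.default_rng(0)
base = rng.normal(size=(6, 3))
shapes = []
for _ in range(4):
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthogonal matrix
    shapes.append(2.5 * base @ q + rng.normal(size=3))
aligned, mean_shape = generalized_procrustes(shapes)
pdm_mean, pdm_modes, pdm_var = point_distribution_model(aligned)
```

In the demo every input is a similarity-transformed copy of one shape, so the aligned shapes coincide and the PDM variances are close to zero; with real annotations the leading modes would capture the anatomical variability that remains after alignment.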
As described above, MSL-based detection can be used to separately segment each individual (sub)-cortical brain structure. However, using MSL for rigid detection of each brain structure independently may not be optimal. Accordingly, embodiments of the present invention take advantage of an inherent relationship between the positions of the (sub)-cortical brain structures (left and right caudate nucleus, hippocampus, globus pallidus, putamen, and amygdala) during the segmentation of the brain structures.
Referring to
At step 204, the voxel intensities of the received 3D MR image are standardized. This intensity standardization allows the segmentation method of
At step 206, a meta-structure including positions of each of the target (sub)-cortical brain structures is detected in the 3D MR image. The meta-structure is a composition of all of the target brain structures, and can be generated based on mean positions of the brain structures in the training data. The meta-structure can be a 3D shape that includes center locations of each of the target (sub)-cortical brain structures. Accordingly, the detection of the location, orientation, and scale of the meta-structure in the 3D MR image results in an estimate for the center position of each of the individual (sub)-cortical brain structures. According to an embodiment of the present invention, the meta-structure can be detected using marginal space learning to sequentially detect position, position-orientation, and a full similarity transformation of its associated 3D shape. For each stage of the marginal space learning detection, a discriminative PBT classifier can be trained based on training data.
The meta-structure detection exploits the relationship between the positions of the multiple (sub)-cortical brain structures to speed up detection and delineation of (sub)-cortical brain structures. Let $N \in \mathbb{N}^+$ be the number of target structures to be detected. Their hierarchical shape representation is a shape $(s_i, R_i, t_i, X_i)$ for $i \in \{1, \ldots, N\}$. A meta-structure $(\hat{s}, \hat{R}, \hat{t}, \hat{X})$ is defined with
$$\hat{X} = \left(t_1^T, \ldots, t_N^T\right)^T$$
and $\hat{s}$, $\hat{R}$, and $\hat{t}$ estimated based on $\hat{X}$ via principal component analysis (PCA). This definition of the meta-structure enables the training of a chain of discriminative models (e.g., PBT classifiers) for rigid meta-structure detection similar to the rigid detection of any other individual (sub)-cortical brain structure as described above. Using GPA, a population meta-structure mean shape can be generated based on the annotated training data. Instead of iteratively adapting an initial shape, i.e., the mean shape, after it has been rigidly positioned according to the rigid detection result $(\hat{s}^*, \hat{R}^*, \hat{t}^*)$, as done for the shape inference described above, the initial estimate of $\hat{X}^*$ is used to constrain subsequent position detection steps for individual (sub)-cortical brain structures. In particular, position detection for each individual (sub)-cortical brain structure is performed exclusively on candidate voxels that fall within a certain radius of the corresponding meta-structure mean shape points $\hat{x}_i^*$, $i \in \{1, \ldots, m\}$.
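The two ideas in this passage, estimating a meta-structure pose from the structure centers by PCA and restricting position candidates to a radius around a predicted center, can be sketched as follows. The specific PCA convention used here (mean as position, eigenvectors as orientation, square roots of eigenvalues as scale) is an assumption for illustration, as are the function names and the voxel-unit radius.

```python
import numpy as np

def meta_structure_pose(centers):
    """Estimate (t_hat, R_hat, s_hat) from an (N, 3) array of structure centers."""
    centers = np.asarray(centers, dtype=float)
    t_hat = centers.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov((centers - t_hat).T))
    order = np.argsort(eigvals)[::-1]          # largest variance first
    R_hat = eigvecs[:, order]
    if np.linalg.det(R_hat) < 0:               # keep a proper rotation
        R_hat[:, -1] *= -1.0
    s_hat = np.sqrt(np.maximum(eigvals[order], 0.0))
    return t_hat, R_hat, s_hat

def candidate_mask(volume_shape, predicted_center, radius):
    """Boolean mask of voxels within `radius` of a predicted structure center."""
    grid = np.indices(volume_shape).reshape(3, -1).T
    dist = np.linalg.norm(grid - np.asarray(predicted_center, dtype=float), axis=1)
    return (dist <= radius).reshape(volume_shape)

# Demo with six hypothetical structure centers.
centers = [[0, 0, 0], [4, 0, 0], [0, 2, 0], [4, 2, 0], [2, 1, 1], [2, 1, -1]]
t_hat, R_hat, s_hat = meta_structure_pose(centers)
mask = candidate_mask((5, 5, 5), predicted_center=(2, 2, 2), radius=1.0)
```

Only voxels where `mask` is true would be passed to the position classifier of the corresponding structure, which is where the speed-up described above comes from.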
Returning to
The position-orientation detection, full similarity transformation detection, and shape inference for each brain structure then proceed as described above. In particular, a set of position-orientation hypotheses is generated based on the detected position candidates, and the trained position-orientation classifier detects a number of position-orientation candidates from the position-orientation hypotheses. A set of similarity transformation hypotheses is generated based on the detected position-orientation candidates, and the trained similarity transformation classifier is used to detect the best similarity transformation from the similarity transformation hypotheses. The shape of the detected structure is then iteratively adjusted using the trained shape inference classifier, as described above.
Returning to
The above-described methods for segmenting multiple (sub)-cortical brain structures in 3D MR images may be implemented on a computer using well-known computer processors, memory units, storage devices, computer software, and other components. A high level block diagram of such a computer is illustrated in
The foregoing Detailed Description is to be understood as being in every respect illustrative and exemplary, but not restrictive, and the scope of the invention disclosed herein is not to be determined from the Detailed Description, but rather from the claims as interpreted according to the full breadth permitted by the patent laws. It is to be understood that the embodiments shown and described herein are only illustrative of the principles of the present invention and that various modifications may be implemented by those skilled in the art without departing from the scope and spirit of the invention. Those skilled in the art could implement various other feature combinations without departing from the scope and spirit of the invention.
This application claims the benefit of U.S. Provisional Application No. 61/098,273, filed Sep. 19, 2008, the disclosure of which is herein incorporated by reference.
Number | Name | Date | Kind |
---|---|---|---|
5751838 | Cox | May 1998 | A |
7136516 | Alyassin | Nov 2006 | B2 |
8068654 | Barbu | Nov 2011 | B2 |
20050244036 | Rusinek | Nov 2005 | A1 |
20050283054 | Reiman | Dec 2005 | A1 |
20070160277 | Slabaugh et al. | Jul 2007 | A1 |
20080298659 | Spence | Dec 2008 | A1 |
20080306379 | Ikuma | Dec 2008 | A1 |
20090030304 | Feiweier et al. | Jan 2009 | A1 |
20090316988 | Xu | Dec 2009 | A1 |
20100020208 | Barbu | Jan 2010 | A1 |
Entry |
---|
Cootes T.F. et al., “Active Shape Models—Their Training and Application”, Computer Vision and Image Understanding, 61(1), Jan. 1995, pp. 38-59. |
Cox, I.J. et al., “Dynamic Histogram Warping of Image Pairs for Constant Image Brightness”, In IEEE International Conference on Image Processing, vol. II, Washington D.C., USA, Oct. 1995. |
Georgescu, B. et al., “Database-Guided Segmentation of Anatomical Structures with Complex Appearance”, In IEEE Comp. Soc. Conf. Comp. Vis. Pat. Recog., San Diego, CA, USA, Jun. 2005. |
Gower, J.C., “Generalized Procrustes Analysis”, Psychometrika 40(1), Mar. 1975, pp. 33-50. |
Tu, Z., “Probabilistic Boosting-Tree: Learning Discriminative Models for Classification, Recognition, and Clustering”, in IEEE Int'l. Conf. Comp. Vis., Beijing, China, Oct. 2005, pp. 1589-1596. |
Zheng, Y., et al., “Four-Chamber Heart Modeling and Automatic Segmentation for 3D Cardiac CT Volumes Using Marginal Space Learning and Steerable Features”, IEEE Transactions on Medical Imaging, 27(11), Nov. 2008, pp. 1668-1681. |
Number | Date | Country | |
---|---|---|---|
20100074499 A1 | Mar 2010 | US |
Number | Date | Country | |
---|---|---|---|
61098273 | Sep 2008 | US |