The present invention relates to image segmentation, and more particularly, to graph cut image segmentation using a shape prior.
Image segmentation is used to distinguish and partition an object or region (foreground) of a digital image from the background of the digital image. Image segmentation is commonly used, for example, in medical image analysis. Another popular use for image segmentation is in digital photograph editing.
Segmentation is a fundamental task in image processing and numerous methods have been developed to attempt to accurately segment an image. Some image segmentation methods rely on energy minimization in order to partition an image into multiple regions. Such methods include active contour image segmentation methods and graph cut image segmentation methods.
In an active contour method, the energy is typically comprised of image terms, which are regional and/or boundary based, as well as intrinsic regularization terms. An initial contour, or closed curve, is formed on the image, and based on energy minimization, the initial contour iteratively deforms to move to the region or object of interest. Active contour methods, however, can be sensitive to the initialization of the contour, since the energy minimization is subject to local minima. Active contour methods can also be subject to “leaking”, which occurs when due to noise, clutter, poor contrast, etc., the image data does not provide enough information to stop the contour at the desired location.
In a graph cut image segmentation method, an energy minimization is performed on a graph. The graph is typically generated using vertices representing pixels of the image, as well as edges connecting the vertices, often using 4 or 8 neighborhood connectivity. It is also possible that the vertices of the graph could represent the connectivity of pixels in the image, while the edges of the graph represent the edges of the image. The energy in a graph cut image segmentation typically includes a region term that assigns penalties based on labeling a pixel as foreground or background, as well as a boundary term that assigns a penalty based on the dissimilarity of adjacent pixels. Edges connecting the pixels are cut so that each pixel is associated with either the foreground or the background of the image. The energy function to be minimized is typically the summation of weights of the edges that are cut. Conventional graph cut methods are not iterative, and typically achieve global minimization for an energy function. However, in conventional graph cut image segmentation methods, “leaking” can occur when an object has a weak boundary condition or is grouped together with another object having a similar intensity.
The present invention overcomes the foregoing and other problems encountered in the known teachings by providing a system and method for graph cut image segmentation using a shape prior. A shape prior is a shape constraint applied to the image segmentation method to segment from an image a particular shaped object representable by a known class of shapes.
In one embodiment of the present invention, an initial shape is applied to a portion of an image to be segmented. A narrowband is then formed around a border of the shape, and a minimized graph cut is calculated for a portion of the image within the narrowband. The shape is then adjusted by fitting a result of the minimized graph cut using a shape prior. The steps of forming the narrow band, calculating the minimized graph cut, and adjusting the shape can be iteratively performed, with each iteration using the adjusted shape from the previous iteration. The shape prior can be a parametric shape, such as an ellipse, or a statistical shape eigenspace calculated based on one or more training shapes.
The initial shape can be applied to the image by selecting a seed point on the image and forming the initial shape at the seed point. The seed point can be selected automatically by a computer or manually by a user. In the case of the shape prior being a statistical shape eigenspace, a mean shape of the statistical shape eigenspace can be used as the initial shape.
In order to calculate the minimized graph cut, a shape mask can be generated based on the shape, and the mean intensities of pixels inside and outside of the shape can be calculated. An energy function based on the shape mask and the mean intensities is calculated, and the graph cut that minimizes the energy function is determined.
These and other advantages of the invention will be apparent to those of ordinary skill in the art by reference to the following detailed description and the accompanying drawings.
According to an embodiment of the present invention, a graph cut image segmentation method using a shape prior is used to segment an image. A shape prior is a shape constraint applied to an image segmentation method to force a segmented object or region of an image into a particular shape. According to embodiments of the present invention, the shape prior used with the graph cut image segmentation method may be a parametric shape prior or a statistical shape prior.
In order to segment from an image an object whose shape has a known mathematical representation, a parametric shape prior can be used. A parametric shape prior uses a known parametric shape to constrain the image segmentation. For example, an elliptical shape constraint can be used to model a multitude of objects, including a wide variety of anatomical structures, such as blood vessels and lymph nodes. The segmentation of lymph nodes, for instance, is an important application in the staging of lymphatic cancer. Furthermore, the ellipse has a simple parametric equation that can be applied to a graph cut image segmentation method. Other parametric shapes can include, but are not limited to, triangles, squares, rectangles, general n-gons (polygons), circles, ellipses, superellipses, etc. in two dimensions, and spheres, spheroids, cubes, cones, cylinders, parallelepipeds, general polygonal meshes, etc. in three dimensions. Although an elliptical shape constraint is described herein, the present invention is not limited thereto, and any other parametric shape can be used similarly to constrain the graph cut image segmentation.
In order to segment from an image an object that does not have a known or simple mathematical representation, a statistical shape prior (shape eigenspace) can be used. A statistical shape prior is formed from a set of training shapes similar to the shape of the object to be segmented. A statistical shape prior compactly represents variation of a given set of training shapes by forming a shape eigenspace from the training shapes. In forming a statistical shape prior from training shapes, the ith aligned training shape, i = 1, . . . , N, can be represented as the zero level set of a signed distance function, $\Psi_i$, which is negative inside the shape and positive outside the shape. Given this set of training data, $\{\Psi_1, \ldots, \Psi_N\}$, the mean shape can be computed as
$$\bar{\Psi} = \frac{1}{N}\sum_{i=1}^{N}\Psi_i.$$
Then, the shape variability about the mean can be determined by subtracting the mean shape from each training shape, resulting in N mean-offset functions $\{\tilde{\Psi}_1, \ldots, \tilde{\Psi}_N\} = \{\Psi_1 - \bar{\Psi}, \ldots, \Psi_N - \bar{\Psi}\}$.
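The mean shape and the mean-offset functions can be illustrated with a minimal sketch. It assumes each training shape is already aligned and stored as a small 2-D grid of signed distance values; the function names and the toy values are hypothetical and serve only to show the arithmetic.

```python
def mean_shape(shapes):
    """Pixel-wise mean of N equally sized 2-D signed distance maps."""
    n = len(shapes)
    rows, cols = len(shapes[0]), len(shapes[0][0])
    return [[sum(s[r][c] for s in shapes) / n for c in range(cols)]
            for r in range(rows)]


def mean_offsets(shapes):
    """Subtract the mean shape from every training shape."""
    mean = mean_shape(shapes)
    return [[[s[r][c] - mean[r][c] for c in range(len(mean[0]))]
             for r in range(len(mean))]
            for s in shapes]


# Two toy "signed distance" maps (illustrative values, not real distances).
shapes = [
    [[1.0, -1.0], [2.0, 0.0]],
    [[3.0, -3.0], [0.0, 2.0]],
]
mean = mean_shape(shapes)
offsets = mean_offsets(shapes)
```

By construction the mean-offset functions sum to zero at every pixel, which is a convenient sanity check before any further analysis.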
While the mean-offset functions capture the shape variation in the set of training shapes, the mean-offset functions are highly redundant. To represent this shape variation more compactly, principal components analysis (PCA) can be performed on the mean-offset functions to determine principal modes of variation. First, the columns of each $N_1$ by $N_2$ mean-offset function $\tilde{\Psi}_i$ are stacked into a large column vector $\tilde{\psi}_i$. Then, a shape variability matrix S can be formed as $S = [\tilde{\psi}_1 \ldots \tilde{\psi}_N]$, which has a size M by N, with $M = N_1 N_2$.
Although a singular value decomposition (SVD) can be performed on the covariance matrix $\frac{1}{N}SS^T$, for typical images this M by M matrix is very large, so SVD becomes computationally expensive. Fortunately, it is possible to perform SVD on a much smaller matrix, $\frac{1}{N}S^TS$, which has size N by N. Computing this SVD results in N eigenvalues and eigenvectors. If d is an eigenvector of $\frac{1}{N}S^TS$ with eigenvalue λ, then Sd is an eigenvector of $\frac{1}{N}SS^T$ with eigenvalue λ. Often, the majority of the variation in the mean-offset images is captured in just a few modes. Therefore, the k most significant modes are used in the representation.
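The relationship between the eigenvectors of the small N by N matrix and those of the large M by M matrix can be verified in one line. The derivation below assumes the covariance is normalized by 1/N; the argument is unchanged under any other constant scaling.

```latex
\frac{1}{N} S^{T} S\, d = \lambda\, d
\;\Longrightarrow\;
\frac{1}{N} S S^{T} (S d)
  = S \left( \frac{1}{N} S^{T} S\, d \right)
  = \lambda\, (S d)
```

Whenever λ is nonzero, Sd is nonzero as well, so each such eigenvector of the small matrix yields a valid eigenvector of the large one (up to normalization).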
Different shapes in the shape eigenspace can be represented as:
$$\Phi(w) = \bar{\Psi} + \sum_{i=1}^{k} w_i \Phi_i \quad (1)$$
where $w_i$ is a weight on the ith mode $\Phi_i$. Likewise, if a new aligned shape is given, it is possible to compute its signed distance function Ψ and project it into the shape eigenspace by finding the weights as:
$$w = U_k^T(\Psi - \bar{\Psi}) \quad (2)$$
where $U_k$ is a matrix comprising the first k eigenvectors of U. The projected shape can then be extracted from the zero level set of Φ(w) using Equation 1.
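The projection into the eigenspace and the reconstruction from the weights can be sketched as follows. The sketch assumes the signed distance functions have been flattened into plain lists and that the retained modes are orthonormal; the function names and the toy modes are hypothetical.

```python
def project(psi, mean, modes):
    """Weights w = U_k^T (psi - mean); each mode is one column of U_k."""
    diff = [a - b for a, b in zip(psi, mean)]
    return [sum(m * d for m, d in zip(mode, diff)) for mode in modes]


def reconstruct(weights, mean, modes):
    """Phi(w) = mean + sum_i w_i * mode_i."""
    out = list(mean)
    for w, mode in zip(weights, modes):
        for j, m in enumerate(mode):
            out[j] += w * m
    return out


mean = [0.0, 0.0, 0.0, 0.0]
# Two orthonormal toy modes (assumed, for illustration only).
modes = [[0.5, 0.5, 0.5, 0.5], [0.5, -0.5, 0.5, -0.5]]
psi = [3.0, 1.0, 3.0, 1.0]

w = project(psi, mean, modes)
phi = reconstruct(w, mean, modes)
```

In this toy case the new shape lies entirely within the span of the two modes, so the reconstruction reproduces it exactly; in general the projection discards whatever variation the k retained modes cannot express.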
Before discussing specific aspects of the graph cut segmentation algorithm using a shape prior, graph theory will be discussed. In particular, an undirected graph G=<V,E> consists of vertices V and undirected edges E that connect the vertices. Each edge e ∈ E is assigned a non-negative cost $\omega_e$. There are two special vertices (referred to herein as “terminals”) in the graph that are identified as the source s and the sink t. With the exception of the terminals s and t, the vertices are comprised of pixels P of an image to be segmented. The image to be segmented is a digital image, and can be obtained using standard digital photography, as well as medical imaging technology, such as Magnetic Resonance Imaging, ultrasound, x-ray, computed tomography, SPECT, PET, IVUS, OCT, etc.
A cut C is a set of edges whose removal separates the source s from the sink t. The cost of the cut is the sum of the costs of the edges that are severed by the cut, such that:
$$|C| = \sum_{e \in C} \omega_e \quad (3)$$
In order to select a cut C, a minimum cut (i.e., the cut with the smallest cost) must be determined. There are numerous algorithms for finding the minimum, as is well known in the art.
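As one illustration of a minimum cut computation, the sketch below uses the Edmonds-Karp max-flow algorithm (breadth-first augmenting paths). This is a generic textbook choice, not necessarily the algorithm used in any embodiment; the graph, vertex names, and edge costs are hypothetical.

```python
from collections import deque


def undirected_graph(edges):
    """Build a symmetric adjacency map {u: {v: cost}} from (u, v, cost) triples."""
    graph = {}
    for u, v, cost in edges:
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    return graph


def min_cut(graph, s, t):
    """Return (cut cost, set of vertices left on the source side)."""
    residual = {u: dict(nbrs) for u, nbrs in graph.items()}
    total_flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            # No path left: the vertices still reachable form the source side.
            return total_flow, set(parent)
        # Find the bottleneck capacity along the path, then push flow.
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total_flow += bottleneck


graph = undirected_graph([
    ("s", "a", 4), ("s", "b", 1), ("a", "b", 1), ("a", "t", 1), ("b", "t", 4),
])
cost, source_side = min_cut(graph, "s", "t")
```

Here the cheapest cut severs the edges s-b, a-b, and a-t, leaving vertex a with the source and vertex b with the sink. By the max-flow/min-cut theorem the returned flow value equals the cost of that cut.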
In order to perform a graph cut image segmentation for a set of pixels P of an image, it is possible to compute a labeling f that minimizes an energy function. The labeling f labels each pixel as either foreground or background. The energy function takes the form:
$$E(f) = \sum_{p \in P} D_p(f_p) + \sum_{(p,q) \in N} V_{p,q}(f_p, f_q) \quad (4)$$
where E is the energy, p and q are pixels, and N is a neighborhood formed from the vertex connectivity. Here, connectivity refers to the way edges are formed between adjacent pixels in the image. For example, in two dimensions, 4-connectivity implies forming edges between a pixel p and its neighboring pixels to the right, left, up, and down. The connectivity defines the topology of the graph. Dp(fp) is a region term that measures the cost of assigning the label fp (foreground or background) to pixel p, while Vp,q is a boundary term that measures the cost of assigning labels fp,fq to adjacent pixels p and q.
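Evaluating such an energy for a given labeling can be sketched as below, assuming the energy is the sum of the region term over all pixels plus the boundary term over all 4-connected neighbor pairs. The cost functions passed in are hypothetical placeholders, not the terms defined in this description.

```python
def energy(labels, data_cost, pair_cost):
    """Sum region costs over pixels and boundary costs over 4-connected pairs."""
    rows, cols = len(labels), len(labels[0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            total += data_cost(r, c, labels[r][c])
            # Visit only the right and lower neighbors so each pair counts once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    total += pair_cost((r, c), (rr, cc),
                                       labels[r][c], labels[rr][cc])
    return total


image = [[10, 10], [200, 200]]


def data_cost(r, c, label):
    # Hypothetical region term: bright pixels should be foreground (label 1).
    bright = image[r][c] > 100
    return 0.0 if bright == (label == 1) else 1.0


def pair_cost(p, q, fp, fq):
    # Hypothetical boundary term: constant penalty when neighbors disagree.
    return 0.5 if fp != fq else 0.0


split = energy([[0, 0], [1, 1]], data_cost, pair_cost)
flat = energy([[0, 0], [0, 0]], data_cost, pair_cost)
```

The labeling that separates the dark row from the bright row pays only two boundary penalties, so it has lower energy than labeling everything as background.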
According to an embodiment of the present invention, both Dp(fp) and Vp,q comprise two terms, one from image data and one from a shape constraint (shape prior) applied to the image. Given an initial shape constraint applied to the image, the mean intensity of pixels inside of the shape μi and the mean intensity of the pixels outside of the shape μo are calculated and used in the image data terms of Dp(fp) and Vp,q. Additionally, a shape mask M is generated. The shape mask M is a binary image, which is 0 inside the shape constraint and 1 outside the shape constraint. The shape mask M is used in the shape constraint terms of Dp(fp) and Vp,q.
Accordingly, the terminal weights, or region term of E, can be expressed using a Gaussian matching function by:
Dp(foreground)=e−I(p)−μ
Dp(background)=e−(I(p)−μ
and the neighbor weights can be expressed using the Gaussian matching function by:
Vp,q=e−(I(p)−I(q))
where I is the intensity of a pixel and σ is a standard deviation of the Gaussian matching function. As expressed in Equations 5-7, the contribution of the shape term in each equation is weighted by a factor λ. This allows the strength of the shape constraint to be adjusted. For example, the larger the value of λ, the less deviation of the graph cut solution from the shape constraint.
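The interplay of the image term and the λ-weighted shape term can be illustrated with a hedged sketch. The exact expressions of Equations 5-7 are not reproduced here: the squared difference divided by σ² and the additive mask term below are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def gaussian_match(a, b, sigma):
    """Assumed Gaussian matching function: near 1 when a and b agree."""
    return math.exp(-((a - b) ** 2) / sigma ** 2)


def terminal_weights(intensity, mask_value, mu_in, mu_out, sigma, lam):
    """Return (foreground weight, background weight) for one pixel.

    Assumed shape term: favors foreground where the mask is 0 (inside the
    shape) and background where the mask is 1 (outside the shape).
    """
    fg = gaussian_match(intensity, mu_in, sigma) + lam * (1 - mask_value)
    bg = gaussian_match(intensity, mu_out, sigma) + lam * mask_value
    return fg, bg


# A pixel that matches the inside mean and lies inside the shape.
fg_inside, bg_inside = terminal_weights(100.0, 0, 100.0, 20.0, 10.0, 0.5)
# The same pixel intensity, but located outside the shape.
fg_outside, bg_outside = terminal_weights(100.0, 1, 100.0, 20.0, 10.0, 0.5)
```

Under these assumptions, raising λ widens the gap between the two weights according to the mask alone, which is one way the graph cut solution could be kept closer to the shape constraint.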
In the case of an elliptical shape prior, the initial shape may be an ellipse of a predetermined or specified size that is formed on the image around a seed point on the image (either selected by a user or determined automatically). In the case of a statistical shape prior (i.e., an eigenspace), the initial shape can be the mean shape of the eigenspace.
At step 320, a shape mask M is generated from the current shape $\tilde{C}$. As described above, the shape mask M is a binary image in which pixels inside the shape $\tilde{C}$ are assigned the value 0 and pixels outside of the shape are assigned the value 1. Accordingly, for a pixel p, M(p)=0 if p is inside $\tilde{C}$ and M(p)=1 if p is outside $\tilde{C}$.
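For an elliptical shape, generating the mask can be sketched as follows. The parameterization (center, semi-axes, rotation angle) and the function name are assumptions made for illustration.

```python
import math


def ellipse_mask(rows, cols, cx, cy, a, b, theta=0.0):
    """Binary mask: 0 inside the ellipse, 1 outside.

    (cx, cy) is the center in (column, row) coordinates, a and b are the
    semi-axes, and theta is the rotation of the a-axis in radians.
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    mask = []
    for y in range(rows):
        row = []
        for x in range(cols):
            dx, dy = x - cx, y - cy
            # Rotate the offset into the ellipse's own axes.
            u = dx * cos_t + dy * sin_t
            v = -dx * sin_t + dy * cos_t
            inside = (u / a) ** 2 + (v / b) ** 2 <= 1.0
            row.append(0 if inside else 1)
        mask.append(row)
    return mask


mask = ellipse_mask(5, 7, cx=3, cy=2, a=2.5, b=1.5)
```

For this 5 by 7 grid the middle row contains five inside pixels and the rows directly above and below contain three each, giving a coarse rasterized ellipse.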
At step 330, the mean intensity μi of the pixels inside the shape mask (i.e., inside the shape $\tilde{C}$) and the mean intensity μo of the pixels outside the shape mask (i.e., outside the shape $\tilde{C}$) are calculated. Although this embodiment of the present invention is described using mean intensities, the present invention is not limited thereto, and any statistical measures of the intensities of the pixels inside and outside of the shape $\tilde{C}$ can be used.
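Computing the two mean intensities from an image and its shape mask can be sketched as below. The function name and the fallback value of 0.0 for an empty region are assumptions.

```python
def region_means(image, mask):
    """Return (mean inside, mean outside) where mask 0 = inside, 1 = outside."""
    sum_in = sum_out = 0.0
    n_in = n_out = 0
    for image_row, mask_row in zip(image, mask):
        for value, m in zip(image_row, mask_row):
            if m == 0:
                sum_in += value
                n_in += 1
            else:
                sum_out += value
                n_out += 1
    mu_in = sum_in / n_in if n_in else 0.0
    mu_out = sum_out / n_out if n_out else 0.0
    return mu_in, mu_out


image = [[10, 12, 11], [9, 200, 13], [10, 11, 12]]
mask = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
mu_in, mu_out = region_means(image, mask)
```

Any other statistic of the two regions, such as a median or a histogram, could be substituted by changing only the accumulation inside the loop.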
At step 340, a narrowband is formed around a border of the current shape $\tilde{C}$.
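One simple way to form such a narrowband is sketched below. The band width and the way distance to the border is measured are not specified here, so the sketch assumes a square window: a pixel belongs to the band when both mask values occur within `width` pixels of it. The function name is hypothetical.

```python
def narrowband(mask, width):
    """Return the set of (row, col) pixels within `width` of the shape border."""
    rows, cols = len(mask), len(mask[0])
    band = set()
    for r in range(rows):
        for c in range(cols):
            seen = set()
            for rr in range(max(0, r - width), min(rows, r + width + 1)):
                for cc in range(max(0, c - width), min(cols, c + width + 1)):
                    seen.add(mask[rr][cc])
            # Both inside (0) and outside (1) nearby: the border is close.
            if len(seen) == 2:
                band.add((r, c))
    return band


mask = [[1, 1, 1, 1, 0, 0, 0, 0]]
thin = narrowband(mask, 1)
wide = narrowband(mask, 2)
```

A wider band lets each graph cut move the boundary further in a single iteration, at the cost of a larger graph.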
At step 350, a graph is generated using only pixels inside the narrowband formed at step 340, and the energy for the graph is calculated. The energy is calculated based on the image intensity I, the mean intensities μi and μo, and the shape mask M, as expressed in Equations 4-7. Since the graph is set up using only pixels in the narrowband, the image segmentation is locally constrained to the narrowband pixels.
At step 360, the minimum cut of the graph is calculated. As described above, the minimum cut is determined to minimize the sum of the cost of all the edges that are severed by the cut. The minimum cut also results in minimizing the energy function for the pixels included in the narrowband.
At step 370, the current shape $\tilde{C}$ is adjusted to fit the minimum cut calculated at step 360. More particularly, a new shape is fit to the pixels within the minimum cut C using the shape prior, resulting in a new current shape $\tilde{C}$. At this stage, $\tilde{C}$ lies in the space of shapes representable by the shape prior. In the case of an elliptical shape prior, a new ellipse is fit to the minimum cut. For example, a new ellipse may be formed using a least squares algorithm that minimizes an algebraic distance between the pixels within the minimum cut C and the new ellipse. In the case of a statistical shape prior (i.e., a shape eigenspace), the shape of the minimum cut C is aligned to the shape eigenspace and projected into the shape space, producing a new shape $\tilde{C}$ which is set to be the current shape. According to an embodiment of the present invention, the adjusted shape $\tilde{C}$ can be displayed at this step.
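Fitting an ellipse to the foreground pixels can be illustrated with a short sketch. Note that it swaps in a moment-based fit (centroid and covariance of the pixel coordinates) rather than the algebraic least squares fit mentioned above, because the moment-based version needs no linear solver; the function name is hypothetical.

```python
import math


def fit_ellipse_moments(pixels):
    """Fit (cx, cy, a, b, theta) to a filled region given as (x, y) pixels.

    For a uniformly filled ellipse the coordinate variance along each
    principal axis equals the squared semi-axis divided by 4.
    """
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    sxx = sum((x - cx) ** 2 for x, _ in pixels) / n
    syy = sum((y - cy) ** 2 for _, y in pixels) / n
    sxy = sum((x - cx) * (y - cy) for x, y in pixels) / n
    # Eigenvalues of the 2x2 covariance matrix in closed form.
    half_trace = (sxx + syy) / 2.0
    spread = math.hypot((sxx - syy) / 2.0, sxy)
    a = 2.0 * math.sqrt(half_trace + spread)
    b = 2.0 * math.sqrt(max(half_trace - spread, 0.0))
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return cx, cy, a, b, theta


# Foreground pixels of a small rasterized ellipse centered at (3, 2).
pixels = [(2, 1), (3, 1), (4, 1),
          (1, 2), (2, 2), (3, 2), (4, 2), (5, 2),
          (2, 3), (3, 3), (4, 3)]
cx, cy, a, b, theta = fit_ellipse_moments(pixels)
```

On such a coarse raster the recovered semi-axes are only approximate, but the center and orientation are recovered well; an algebraic least squares fit on the boundary pixels would be the closer match to the fitting described above.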
At step 380, it is determined whether a stop condition has been met. A stop condition is a condition that indicates that the image segmentation is complete. For example, in one embodiment of the present invention, the stop condition is met when the energy function of the graph converges. In this case, if the difference between a previous value of the energy function and a current value of the energy function is less than an error threshold, the energy function converges and the stop condition is met. According to another embodiment of the present invention, a predetermined number of iterations can be performed before the stop condition is met. If the stop condition has not been met at step 380, the method returns to step 320. Accordingly, the method is repeated, each time adjusting the shape prior until the stop condition is met. If the stop condition is met at step 380, the method proceeds to step 390.
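The two stop conditions described above (energy convergence and a fixed iteration budget) can be combined in a small driver loop. In this sketch `step` is a hypothetical callable standing in for one pass through steps 320 to 370; it takes the current shape and returns the adjusted shape together with the energy of the minimum cut.

```python
def iterate_until_converged(step, shape, tol=1e-3, max_iters=50):
    """Repeat `step` until the energy change drops below `tol`.

    Returns (final shape, final energy, iterations performed).
    """
    previous = None
    for iteration in range(1, max_iters + 1):
        shape, current = step(shape)
        if previous is not None and abs(previous - current) < tol:
            return shape, current, iteration
        previous = current
    return shape, previous, max_iters


def toy_step(shape):
    # Stand-in for one segmentation pass: move halfway toward 10.0 and
    # report the squared remaining distance as the "energy".
    new_shape = (shape + 10.0) / 2.0
    return new_shape, (new_shape - 10.0) ** 2


final_shape, final_energy, iterations = iterate_until_converged(toy_step, 0.0)
```

Setting `tol` to zero or a negative value makes the loop run for exactly `max_iters` passes, which corresponds to the embodiment that performs a predetermined number of iterations.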
At step 390, the segmented image is output. For example, the segmented image can be output by displaying the segmented image on a screen of a computer, printing the segmented image, storing the segmented image in memory of a computer, or outputting the segmented image to a software program, such as digital image editing software or medical diagnostic software. The output segmented image is segmented into a background portion and an object having the shape of the adjusted shape $\tilde{C}$ formed using the shape prior, as adjusted in the final iteration of step 370. In the case of the elliptical shape prior, the object is in the shape of an ellipse. In the case of the statistical shape prior, the object will have a similar shape to the training shapes that are used to define the shape eigenspace.
The steps of the method described above have been described to give a visual understanding of the image segmentation method. It is to be understood that the steps may be performed within a computer system using images stored within the computer system. Accordingly, some steps of the above-described method can occur as internal representations within the computer system.
The graph cut image segmentation method using a shape prior can be implemented on a computer using well known computer processors, memory units, storage devices, computer software, and other components. A high level block diagram of such a computer is illustrated in
The foregoing Detailed Description is to be understood as being in every respect illustrative and exemplary, but not restrictive, and the scope of the invention disclosed herein is not to be determined from the Detailed Description, but rather from the claims as interpreted according to the full breadth permitted by the patent laws. It is to be understood that the embodiments shown and described herein are only illustrative of the principles of the present invention and that various modifications may be implemented by those skilled in the art without departing from the scope and spirit of the invention. Those skilled in the art could implement various other feature combinations without departing from the scope and spirit of the invention.
This application claims the benefit of U.S. Provisional Application No. 60/699,639, filed Jul. 15, 2005, the disclosure of which is herein incorporated by reference.
Number | Date | Country | |
---|---|---|---|
20070014473 A1 | Jan 2007 | US |
Number | Date | Country | |
---|---|---|---|
60699639 | Jul 2005 | US |