The disclosure relates to segmenting digital images, and more particularly to segmenting digital images in the case of more than two labels.
A method and system have been invented that allow for image segmentation in the case of more than two labels using a novel graph cut model.
The method is based on graphs that have the form of undirected flow networks. The flow network is an undirected weighted graph, G, defined by a set of nodes or vertices, V, a set of edges, E, and a positive weight function, w, associated with the edges:
G=(V, E) and w:E→R+ (1)
where each edge, {p, q}, is defined as a pair of nodes p, q∈V. Additionally, a subset of the nodes, L, are “label” nodes.
A graph cut, C, is defined as a subset of the edges in the graph such that, when those edges are removed from the graph, the remaining graph consists of |L| connected components and each connected component contains exactly one node from the set L.
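The definition above can be checked mechanically on a small graph. The following is a minimal sketch, not part of the disclosed method; the function name `is_valid_label_cut` and its arguments are hypothetical and chosen only for illustration. It removes the candidate cut edges, walks the remaining graph, and verifies that every connected component contains exactly one label node.

```python
from collections import defaultdict


def is_valid_label_cut(nodes, edges, labels, cut):
    """Return True if removing `cut` leaves one component per label node."""
    cut_set = {frozenset(e) for e in cut}
    adjacency = defaultdict(set)
    for p, q in edges:
        if frozenset((p, q)) not in cut_set:  # keep only the uncut edges
            adjacency[p].add(q)
            adjacency[q].add(p)

    label_set = set(labels)
    seen = set()
    components = 0
    for start in nodes:
        if start in seen:
            continue
        # Depth-first search over one connected component.
        stack = [start]
        seen.add(start)
        labels_here = 0
        while stack:
            node = stack.pop()
            if node in label_set:
                labels_here += 1
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        if labels_here != 1:  # a component with zero or several labels
            return False
        components += 1
    return components == len(label_set)
```

For a path a, b, c, d with label nodes a and d, cutting the single edge {b, c} is a valid graph cut, whereas cutting nothing, or cutting both {a, b} and {b, c}, is not.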
A minimum-weight graph cut, or minimum cut, is a graph cut that has the minimum weight among all possible graph cuts:
where χ is the set of all possible graph cuts.
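Under the common convention that the weight of a cut is the sum of the weights of its edges (an assumption here, since the text does not spell the convention out), the minimum cut may be written as follows, with the symbol C_min introduced only for illustration:

```latex
C_{\min} = \operatorname*{arg\,min}_{C \in \chi} \sum_{e \in C} w(e)
```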
An exact solution for the graph cut can be obtained for the case with binary labels. Methods for obtaining this solution are based on establishing a flow through the graph. Any flow is allowed that is conserved at each node and that, for each edge in the graph, does not exceed the weight of the edge. Another approach is to establish flow through electrical network models. In these methods, an electrical network is defined using resistors that represent the weights of the edges in the graph. A competitive algorithm for obtaining the graph cut has arisen based on the electrical model. A novel methodology is presented here that generalizes prior methods to allow for image segmentation in the case of more than two labels.
A computer-implemented method for image segmentation is disclosed, the method comprising acquiring a digital image; defining a set of image vertices from the digital image; defining a set of label vertices; constructing a graph from said image vertices and label vertices wherein said graph; calculating a non-negative cost function; constructing an electrical network based upon the constructed graph; simulating the electrical network one time for each label; and segmenting the digital image based upon the voltages obtained from the simulations of the electrical network.
One or more embodiments will now be described, by way of example, with reference to the accompanying drawings, in which:
Various embodiments of the present invention will be described in detail with reference to the drawings. Reference to various embodiments does not limit the scope of the invention, which is limited only by the scope of the claims attached hereto. Additionally, any examples set forth in this specification are not intended to be limiting and merely set forth some of the many possible embodiments for the claimed invention.
Referring now to the drawings, wherein the depictions are for the purpose of illustrating certain exemplary embodiments only and not for the purpose of limiting the same,
A graph is constructed at step 100 from the image or, in one embodiment, a series of images. Each pixel, i.e., point, of said image is associated with a vertex of said graph, and said graph includes an edge connecting each pair of vertices corresponding to adjacent points in said image. The graph may be defined as G=(V, E, w), with V representing the set of vertices of the graph, E representing the set of edges of the graph, such that every pixel u or v has a corresponding vertex, and w representing the set of non-negative weights of the edges in the graph. A set L is defined representing labels. Image segmentation consists of assigning one of these labels to each point in the image.
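As an illustration of the pixel-adjacency part of this construction, the sketch below enumerates the edges between adjacent pixels of a rectangular image. It is a minimal example under stated assumptions: pixels are identified by hypothetical (row, column) tuples, the function name `grid_edges` is invented for this example, and the `connectivity` parameter switches between 4-neighbor adjacency and the 8-neighbor adjacency mentioned for the computed tomography embodiment.

```python
def grid_edges(height, width, connectivity=8):
    """List undirected edges between adjacent pixels of a height x width image."""
    if connectivity == 4:
        offsets = [(0, 1), (1, 0)]
    elif connectivity == 8:
        # Right, down, and the two downward diagonals; each pair appears once.
        offsets = [(0, 1), (1, 0), (1, 1), (1, -1)]
    else:
        raise ValueError("connectivity must be 4 or 8")

    edges = []
    for row in range(height):
        for col in range(width):
            for d_row, d_col in offsets:
                n_row, n_col = row + d_row, col + d_col
                if 0 <= n_row < height and 0 <= n_col < width:
                    edges.append(((row, col), (n_row, n_col)))
    return edges
```

Only "forward" offsets are used so that each undirected edge is listed once; label vertices and edge weights would be added separately.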
In one embodiment, optional limited segmentation may be performed by a user-operator 110. The optional limited segmentation, or initial segmentation, may be outputted to the graph construction 120. In one embodiment, optional limited segmentation may be performed by assigning seeds, either interactively or automatically by acquiring landmarks that belong to each partition of the image.
An electrical network will be constructed based on the graph 130. The electrical network is a network of resistors with each resistor representing one of the edges in the graph with the same connectivity as the corresponding edges in the graph. The electrical behavior of the resistors is non-linear and is given by the relation:
where i and x are vectors with |L| components. The vector i is analogous to electrical current and the vector x is analogous to electrical voltage.
Each of the components of the current and voltage vectors is associated with a label in the graph. The relationship between labels and the voltage and current vectors is given by a map θ:
$\theta: L \to \{u_1, u_2, \ldots, u_{|L|}\}$ (4)
where $u_1, u_2, \ldots$ are vectors in $\mathbb{R}^{|V|}$. For each label, the vector is 1 in the component whose index is that of the node in the graph that is the label. All other components of the vector are zero.
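The map θ can be pictured as a table of indicator vectors. The sketch below is illustrative only; it assumes that nodes are numbered 0 to |V|−1 and uses a hypothetical function name, `label_map`.

```python
def label_map(num_nodes, label_nodes):
    """Map each label node to a vector that is 1 at that node's index, 0 elsewhere."""
    theta = {}
    for node in label_nodes:
        u = [0.0] * num_nodes
        u[node] = 1.0  # the component indexed by the label's own node
        theta[node] = u
    return theta
```

For example, with five nodes and label nodes 0 and 4, the vector for label node 4 is [0, 0, 0, 0, 1].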
Input to the electrical network is applied at the nodes in the electrical network corresponding to the nodes in the graph in the set L. A vector-valued voltage is applied to each of the input nodes with the vector direction given by θ:
$x_l = v_{in}\,\theta(l)$ (5)
where $v_{in}$ is a large number and $l \in L$.
Constraints are used to obtain a system of equations for the voltage. At each node p∈V\L, the net electrical current is constrained to be zero.
where Np is the set of nodes in the electrical network that are directly connected to the node p.
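The zero-net-current constraint is Kirchhoff's current law applied to each non-label node. Written out with a hypothetical symbol $i_{pq}$ (not defined in the text) for the current vector flowing from node p to node q through the resistor on edge {p, q}, one plausible form of the constraint is:

```latex
\sum_{q \in N_p} i_{pq} = 0, \qquad p \in V \setminus L
```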
The systems of equations from (5) and (6) can be written in matrix form:
A(X)X=B (7)
where A(X)X represents a factorization of the non-constant terms in the system of equations. X is a |V|×|L| matrix. Each row of X represents the unknown voltage vector, xp, at one of the nodes in the electrical network. A(X) is a |V|×|V| matrix, and B is a |V|×|L| matrix that represents the constant terms in the system of equations.
The system of equations is solved using the fixed point method:
$A(\tilde{X}_k)\,\tilde{X}_{k+1} = B$ (8)
where $\tilde{X}$ is an approximate solution and $k$ is the iteration. The fixed point solution is initialized with the zero matrix, $\tilde{X}_0 = 0$.
Partition of the graph and the image 140 is represented by a map, S, from the set of nodes, V, to the indices of the labels, L. The partition is obtained by selecting, at each node, the label that corresponds to the largest component of the voltage at that node.
$S = \{(j, s_j) \mid s_j = \operatorname{argmax}(\tilde{x}_{ij}),\ j \in V\}$ (9)
where $s_j$ is the index of the label assigned to the $j$th node of the graph and $\tilde{x}_{ij}$ is the voltage vector at the $j$th node after the $i$th iteration.
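The fixed-point iteration and the argmax partition can be sketched together on a toy graph. The non-linear resistor relation is not reproduced in this text, so the example below substitutes an ASSUMED conductance, w divided by the norm of the voltage difference across the edge (plus a small `eps`), purely for illustration; it should not be read as the disclosed relation. The function name `segment`, the dense matrices, and the use of `numpy.linalg.solve` as the direct solver are likewise choices made for this sketch. Each label is given one column of X, and label nodes are held at `v_in` in their own column.

```python
import numpy as np


def segment(num_nodes, weighted_edges, label_nodes, v_in=1e6, iterations=10, eps=1e-9):
    """Fixed-point solve of A(X_k) X_{k+1} = B, then argmax labelling per node."""
    labels = list(label_nodes)
    label_index = {node: k for k, node in enumerate(labels)}

    # B holds the constant terms: v_in at each label node, in that label's column.
    B = np.zeros((num_nodes, len(labels)))
    for node, k in label_index.items():
        B[node, k] = v_in

    X = np.zeros_like(B)  # initial guess: the zero matrix
    for _ in range(iterations):
        A = np.zeros((num_nodes, num_nodes))
        for p, q, w in weighted_edges:
            # ASSUMPTION: conductance shrinks as the voltage difference grows.
            g = w / (np.linalg.norm(X[p] - X[q]) + eps)
            for a, b in ((p, q), (q, p)):
                if a not in label_index:  # zero net current at non-label nodes
                    A[a, a] += g
                    A[a, b] -= g
        for node in labels:
            A[node, node] = 1.0  # label nodes keep their applied voltage
        X = np.linalg.solve(A, B)

    # Assign to each node the label with the largest voltage component.
    return X.argmax(axis=1).tolist(), X
```

On a path 0, 1, 2, 3 with label nodes 0 and 3 and edge weights 1, 1 and 0.1, calling `segment(4, [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 0.1)], [0, 3])` assigns nodes 0, 1 and 2 to the first label and node 3 to the second, which separates the graph at its weakest edge.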
In one embodiment, the invention is applied to the segmentation of an image of the heart using computed tomography. The computed tomography is shown in
$G_{CT} = (V_{CT}, E_{CT})$ and $w_{CT}: E_{CT} \to \mathbb{R}^{+}$ (10)
where $V_{CT} = P_{CT} \cup L_{CT}$, $P_{CT}$ is the set of pixels in the image, and $L_{CT}$ is a set of 10 labels. Each node has a value, $f$, assigned to it. For pixel nodes, the value of $f$ is the image intensity of the corresponding pixel. For label nodes, the value of $f$ is the class intensity of the label. The class intensity is a value in the set $\{-100, -50, 0, \ldots, 350\}$. For each edge, $e_{pq}$, the difference in the intensity function $f$ is given by a function $\delta_{pq}$:
$\delta_{pq} = |f(p) - f(q)|$ (11)
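The intensity difference of equation (11) is straightforward to compute. The sketch below is illustrative; it assumes that the ten class intensities are evenly spaced in steps of 50 between −100 and 350, which is consistent with the stated set and label count but is not stated explicitly, and the names `CLASS_INTENSITIES`, `intensity_difference` and `label_differences` are hypothetical.

```python
# ASSUMPTION: ten class intensities, evenly spaced by 50 from -100 to 350.
CLASS_INTENSITIES = list(range(-100, 351, 50))


def intensity_difference(f_p, f_q):
    """Absolute difference of the values f assigned to nodes p and q."""
    return abs(f_p - f_q)


def label_differences(pixel_intensity):
    """Differences between one pixel node and every label node."""
    return [intensity_difference(pixel_intensity, c) for c in CLASS_INTENSITIES]
```

For a pixel of intensity 40, the differences to the first four label nodes are 140, 90, 40 and 10, so the label with class intensity 50 is the closest in intensity.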
The set of edges in the graph includes edges that produce a regularization effect, $E_{CT}^{N}$, and edges that relate the pixel nodes to the labels, $E_{CT}^{L}$:
$E_{CT} = E_{CT}^{N} \cup E_{CT}^{L}$ (12)
The weight of the edges is given by the function:
For construction of the graph, edges were included based on the 8-neighbor adjacency. The input voltage magnitude was $v_{in} = 10^{6}$. A solution was obtained using 10 fixed-point iterations. A direct solver was used at each fixed-point iteration. The algorithm was implemented in Python with the NumPy, SciPy, and NetworkX modules. The segmentation result is shown in
In another embodiment, the invention is applied to the segmentation of a photograph for the purpose of depth estimation,
In this embodiment, each label represents the shift in the position at which an object appears in photographs taken at two different horizontal positions. That shift is geometrically related to the distance of the object from the cameras. A graph, $G_{stereo}$, is constructed from a pair of photographs:
$G_{stereo} = (V_{stereo}, E_{stereo})$ and $w_{stereo}: E_{stereo} \to \mathbb{R}^{+}$ (14)
The graph, $G_{stereo}$, has the same structure as the graph used for computed tomography segmentation. The set of nodes in the graph, $V_{stereo}$, consists of the pixels in the photograph from the left camera and the set of labels, $L_{stereo}$. Edges are included between nodes of adjacent pixels and between each pixel and all of the label nodes:
$E_{stereo} = E_{stereo}^{N} \cup E_{stereo}^{L}$ (15)
For the edges in Estereo, the weights are given by:
where the functions $f_{left}$ and $f_{right}$ are vectors representing the RGB channels of the color in the left and right photographs, respectively, and $t(e_{pq})$ is a map from an edge to a pixel in the right photograph. The location of the pixel in the right photograph is the position of the pixel p shifted to the left by the shift associated with the label node q.
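The map from a pixel-label edge to a pixel in the right photograph can be sketched as follows. This is an illustrative example only: pixels are hypothetical (row, column) tuples, images are nested lists of RGB tuples, shifts are whole numbers of pixels, and returning `None` when the shifted position falls outside the photograph is an assumption, since the text does not say how such cases are handled.

```python
def shifted_pixel(pixel, shift, width):
    """Position of `pixel` moved left by `shift` columns, or None if off-image."""
    row, col = pixel
    target_col = col - shift
    if 0 <= target_col < width:
        return (row, target_col)
    return None  # ASSUMPTION: out-of-range shifts are simply reported as missing


def color_pair(left_image, right_image, pixel, shift):
    """RGB colors compared on a pixel-label edge: left at p, right at the shifted p."""
    target = shifted_pixel(pixel, shift, len(right_image[0]))
    if target is None:
        return None
    left_color = left_image[pixel[0]][pixel[1]]
    right_color = right_image[target[0]][target[1]]
    return left_color, right_color
```

For instance, the pixel at row 2, column 5 paired with a label whose shift is 3 maps to row 2, column 2 of the right photograph.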
S is defined as:
It is to be understood that the following claims are intended to cover all of the generic and specific features of the invention herein described and all statements of the scope of the invention which, as a matter of language, might be said to fall therebetween.
This application claims benefit of U.S. Provisional Patent Application No. 62/413,940, filed Oct. 27, 2016, and incorporated herein by reference in its entirety.
Number | Date | Country
---|---|---
62413940 | Oct 2016 | US