The present invention relates to methods and systems for automatically identifying individual cells in an image.
The principal types of cells present in the peripheral blood are red blood cells (RBCs), white blood cells (WBCs) and platelets.
Observation of a blood smear under a microscope provides important qualitative and quantitative information for the diagnosis of various diseases, including leukemia.
Unlike peripheral blood smear evaluation, the evaluation of bone marrow is limited to those clinical situations in which hematologic or other abnormalities already have been identified and require further characterization or investigation. In other words, a bone marrow evaluation is usually not a baseline test, but rather a confirmatory test used to rule in or rule out specific hypotheses. Hematopoietic cells in the bone marrow give rise to all the blood cell types. They are grouped into Erythrocyte or Normoblast differentiation series, Leukocyte differentiation series and megakaryocytes.
In one embodiment, the invention provides a method of identifying individual cells in an image of a cytological preparation. The method includes the steps of obtaining an image of a cytological preparation including a plurality of cells; identifying a first region of the image, the first region having a region boundary encompassing at least one lobe, wherein the first region includes at least one cell; detecting at least one circle within the first region, where the at least one circle substantially covers the at least one lobe of the first region; and if the first region has more than one circle, splitting the region into at least two subregions.
In another embodiment, the invention provides a computer-readable medium which includes first instructions executable on a computational device for obtaining an image of a cytological preparation including a plurality of cells; second instructions executable on a computational device for identifying a first region of the image, the first region having a region boundary encompassing at least one lobe, wherein the first region includes at least one cell; third instructions executable on a computational device for detecting at least one circle within the first region, where the at least one circle substantially covers the at least one lobe of the first region; and fourth instructions executable on a computational device for splitting the region into at least two subregions if the first region has more than one circle.
In yet another embodiment, the invention provides a computer-based system for identifying individual cells in an image of a cytological preparation. The system includes a microprocessor and a storage medium operably coupled to the microprocessor. The storage medium includes program instructions executable by the microprocessor for obtaining an image of a cytological preparation including a plurality of cells; identifying a first region of the image, the first region having a region boundary encompassing at least one lobe, wherein the first region includes at least one cell; detecting at least one circle within the first region, where the at least one circle substantially covers the at least one lobe of the first region; and if the first region has more than one circle, splitting the region into at least two subregions.
Other aspects of the invention will become apparent by consideration of the detailed description and accompanying drawings.
Before any embodiments of the invention are explained in detail, it is to be understood that the invention is not limited in its application to the details of construction and the arrangement of components set forth in the following description or illustrated in the following drawings. The invention is capable of other embodiments and of being practiced or of being carried out in various ways.
In various embodiments, the system includes a computer with a microprocessor, memory, data storage, and input and output capabilities. The computer obtains digital images of cytological slides (e.g. from a digital camera or slide scanning system), where the samples on the slides may be stained so that the nuclei are dark (e.g. blue) and the cytoplasm is a contrasting color (e.g. pink), for example using Wright's stain or a related stain combination; other combinations of transmitted light or fluorescent dyes and labels are also possible, including, for example, Hoechst nuclear dye and F-actin phalloidin stain. The computer may be programmed to perform any of the methods described herein, e.g. using the microprocessor, and display results of performing the methods at various stages. For example, the computer may store and/or display to a user (e.g. a clinician) data similar to the images disclosed herein. In addition, the computer may accumulate numerical results, e.g. numbers of cells in an image or a portion thereof, to store and/or display. Thus, in various embodiments, the invention includes a computer-readable medium containing instructions for the microprocessor to perform the methods disclosed herein. The methods and systems described herein can be used to identify individual cells and to automatically identify single cells present in adjacent groups or clumps and to further segment the cells into nuclear and cytoplasmic regions. Image intensity data in the segmented regions can be quantified, e.g. using the grayscale values or using an adjusted value based on a calibration. The methods and systems can be used with a number of cell types, such as those exemplified herein as well as other types including primary cells and cultured cells, including cells of a hepatoma cell line.
In one embodiment, the invention includes a method of segmenting cells in an image of a cytological preparation. The method includes a step of obtaining an image of a cytological preparation including a plurality of cells. The method also includes a step of identifying a first region of the image, the first region having a region boundary encompassing at least one lobe, wherein the first region includes at least one cell. The method further includes a step of detecting at least one circle within the first region, where the at least one circle substantially covers the at least one lobe of the first region. If the first region has more than one circle, the method includes a step of splitting the region into at least two subregions. In various embodiments, the step of identifying at least one circle within the first region includes identifying at least one circle within the first region using a circular Hough transform. In certain embodiments, each of the at least two subregions includes a cell.
The step of splitting the region into at least two subregions includes, in certain embodiments, steps of i) identifying a plurality of maximum curvature points on the region boundary, ii) generating a plurality of vectors connecting pairs of the plurality of maximum curvature points, and iii) eliminating invalid vectors to produce a set of remaining vectors. For each of the remaining vectors, the method includes steps of a) determining a tangent for each of the maximum curvature points, where each tangent has an angle, b) comparing the angles of the tangents, and c) eliminating the remaining vector if the difference between the angles of the tangents is not approximately equal to pi radians to produce a set of final vectors. Finally, splitting the region into at least two subregions includes a step of v) splitting the region into at least two subregions using at least one of the final vectors.
The step of eliminating invalid vectors includes, in some embodiments, eliminating any of the plurality of vectors which cross the region boundary or which are not within the first region. The step of generating a plurality of vectors connecting pairs of the plurality of maximum curvature points includes generating a plurality of vectors connecting pairs of the plurality of maximum curvature points using Delaunay triangulation.
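By way of non-limiting illustration, the tangent-angle comparison described above could be sketched as follows. This is a minimal sketch only; the function names and the tolerance `tol` are hypothetical, since the description states only that the difference between the tangent angles is "approximately equal to pi radians".

```python
import math

def angle_difference(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in [0, pi]."""
    d = abs(a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

def tangents_are_opposed(angle_a: float, angle_b: float, tol: float = 0.35) -> bool:
    """True when the two tangent angles differ by approximately pi radians.

    `tol` (radians) is an assumed tolerance, not a value from the disclosure.
    """
    return abs(angle_difference(angle_a, angle_b) - math.pi) <= tol

def filter_vectors(candidates, tangent_angles, tol=0.35):
    """Keep candidate vectors (i, j) whose endpoint tangents are roughly antiparallel.

    `candidates` holds index pairs into `tangent_angles`, one angle per
    maximum curvature point.
    """
    return [(i, j) for (i, j) in candidates
            if tangents_are_opposed(tangent_angles[i], tangent_angles[j], tol)]
```

In such a sketch, the vectors that survive `filter_vectors` would play the role of the final vectors used to split the region.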
In various embodiments, the method further includes a step of validating the final vectors.
In further embodiments, the step of identifying a first region includes steps of thresholding the image, subtracting a first color channel of the image from a second color channel of the image to selectively remove a first set of image features, applying a hole filling algorithm, and applying a disk-shaped structuring element.
In still further embodiments, each of the at least two subregions includes a cell. In other embodiments, the first region includes a plurality of lobes, wherein each of the plurality of lobes has a cell therein, and wherein splitting the region into at least two subregions includes splitting the region such that each subregion includes one of the plurality of lobes.
In other embodiments, the method includes a step of segmenting at least a portion of the first region into a nuclear region and a cytoplasmic region. The step of segmenting at least a portion of the first region into a nuclear region and a cytoplasmic region can include one or more of thresholding the first region based on intensity, thresholding the first region based on RGB color channel differences, applying a disk-shaped structuring element, identifying a boundary of a nucleus to obtain the nuclear region, and subtracting the nuclear region from the first region to obtain the cytoplasmic region. The method, in yet other embodiments, includes a step of quantifying a staining intensity in at least one of the nuclear region and the cytoplasmic region.
In various embodiments, at least one step of the method is executed using a microprocessor. In other embodiments, the method includes a step of displaying a result to a user on an output device, wherein the result is one of an image or a numerical value.
In some embodiments, the invention includes a computer-readable medium containing program instructions for implementing a method of identifying individual cells in an image of a cytological preparation, wherein execution of the program instructions by a microprocessor of a computer system causes the microprocessor to carry out the steps of the method of any of the preceding claims. In other embodiments, the invention includes a computer system for identifying individual cells in an image of a cytological preparation, the computer system including a microprocessor, memory, data storage, and input and output capabilities, wherein the computer system executes the program instructions contained on the computer-readable medium.
Disclosed herein is a methodology to achieve an automated system for the segmentation of hematopoietic cells from scanned slide microscopic images. This segmentation technique is effective in segmenting both individual and clumped hematopoietic cells. The similar intensity regions that correspond to noise have been eliminated.
Also disclosed is a methodology to achieve an automated system for the detection and classification of normal WBCs from scanned slide microscopic images. The subtypes of WBCs identified are basophil, eosinophil, lymphocyte, monocyte, band and segmented neutrophil. Experiments show that the two-step classification implemented achieves overall accuracies of 93.9% and 88.4% in the 5-subtype and 6-subtype classifications, respectively. Results indicate that the morphological analysis of white blood cells is achievable and that it offers good classification accuracy.
Disclosed is an automated system for differential white blood cell (WBC) counting, which can make the classification of blood cells much faster and less tedious for the technician. We present an automated system for the segmentation and classification of the WBC in the peripheral blood smear from scanned slide images. A new segmentation scheme using color information and morphology is proposed. Then, as the first step of a two-step classification process, a WBC is broadly classified into cells with segmented nucleus and cells with non-segmented nucleus. The nucleus shape is the key factor in deciding which general class the WBC belongs to. Ambiguities associated with connected nuclei lobes are resolved by detecting maximum curvature points and partitioning them using geometric rules. The second step is to obtain a desired set of features using the information from the cytoplasm and nucleus regions to classify the WBC to its final type. We use these features with Linear Discriminant Analysis. This novel two-step classification approach stratifies normal WBC types accurately. System evaluation is performed using a 10-fold cross-validation technique. Confusion matrices of the classifiers are presented to evaluate the accuracy for each type of WBC detection.
Bone marrow evaluation is indicated when peripheral blood abnormalities are not explained by the clinical, physical, or laboratory findings. In the bone marrow, the process of manually locating, identifying and counting hematopoietic cells is tedious and time consuming for the technician. We propose an extension to the previous segmentation method discussed for the peripheral blood smear. We use color information and morphology to eliminate red blood cells and background. The clumped cells are segmented using circle detection on the morphologically processed image and a splitting algorithm based on the detected circle centers. Circular Hough Transform is used for circle detection to find the position and number of circle centers in each region. The splitting algorithm is based on detecting the maximum curvature points, and partitioning them based on information obtained from the centers of the circles in each region. The performance of the segmentation algorithm for hematopoietic cells is evaluated on a set of 3,748 cells.
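By way of non-limiting illustration, circle detection with a circular Hough transform could be sketched as follows. This is a simplified sketch under stated assumptions: a single known radius is used (a practical implementation would typically search a range of radii), the edge points of the region are already available, and the names `hough_circle_centers`, `circle_points` and `min_votes` are hypothetical. The number of returned centers indicates whether a region would be treated as a single cell or as a clump to be split.

```python
import math
from collections import Counter

def hough_circle_centers(edge_points, radius, min_votes, n_angles=360):
    """Vote for circle centers of a known radius from (x, y) edge points.

    Each edge point votes once for every accumulator cell lying `radius`
    away from it. Cells with at least `min_votes` votes are accepted in
    decreasing vote order, skipping cells closer than `radius` to a center
    that was already accepted (a simple non-maximum suppression).
    """
    votes = Counter()
    for (x, y) in edge_points:
        cells = set()
        for k in range(n_angles):
            t = 2.0 * math.pi * k / n_angles
            cells.add((round(x - radius * math.cos(t)),
                       round(y - radius * math.sin(t))))
        votes.update(cells)  # one vote per edge point per accumulator cell
    centers = []
    for cell, count in sorted(votes.items(), key=lambda kv: (-kv[1], kv[0])):
        if count < min_votes:
            break
        if all(math.dist(cell, c) >= radius for c in centers):
            centers.append(cell)
    return centers

def circle_points(cx, cy, radius, n_angles=360):
    """Synthetic test data: integer edge points of a digital circle."""
    pts = set()
    for k in range(n_angles):
        t = 2.0 * math.pi * k / n_angles
        pts.add((round(cx + radius * math.cos(t)),
                 round(cy + radius * math.sin(t))))
    return sorted(pts)

# Two synthetic circular outlines standing in for two cells in one image.
edges = circle_points(20, 20, 10) + circle_points(45, 20, 10)
centers = hough_circle_centers(edges, radius=10, min_votes=30)
needs_splitting = len(centers) > 1
```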
The presently-disclosed methods for segmentation of hematopoietic cells are different from known approaches in several respects. For example, clumped cells are differentiated from single cells so that circle detection and cell-splitting algorithms are only applied to the clumped cells. In addition, RBC are eliminated before detecting the hematopoietic cells.
We describe herein the development of an automated system for the segmentation and classification of the WBC in the peripheral blood smear from scanned slide images. In addition, we describe the development of an algorithm for segmentation of hematopoietic cells in bone marrow from scanned slide images.
This study leads to development of image analysis in the field of pathology and development of an automated tool to identify and classify the WBC from peripheral blood smear and hematopoietic cells in the bone marrow. The results of this study are intended to lead to the replacement of the manual counting of WBCs and hematopoietic cells, which is a tiresome and time-consuming process, by an automated system that is objective and consistent. The proposed automated morphological classification system is a first step towards the automatic detection/monitoring of blood pathologies such as various leukemia types by inspecting the variation in morphology of WBCs.
WBC Segmentation in Peripheral Blood Smear
Segmentation is a critical step in many image analysis problems. The segmentation step is crucial because the accuracy of the subsequent feature extraction and classification steps depends on correct segmentation of WBCs. It is also a difficult and challenging problem due to the complex appearance of these cells and the uncertainty and inconsistencies in microscopic images with varying illumination. Improving cell segmentation has been a common goal of many research efforts. Many automatic segmentation methods have been proposed, most of them based on local image information such as histogram of regions, pixel intensity, discontinuity and clustering techniques.
Many of the segmentation algorithms are based on the edge information present in images. We discuss several of these approaches next. As proposed by Ongun, WBCs were segmented using active contour models (snakes and balloons) which were initialized using morphological operators. This method works well only if the WBC are distinctly separate from the red blood cells and have a dark cytoplasm.
Kumar defined a new edge operator, the Teager energy operator, to highlight the nucleus boundary which is very effective for segmenting the nuclei in cell images. Kumar used a simple morphological method to segment the cytoplasm from the background and the RBCs. The cytoplasm segmentation works well when the RBC and WBC are not close to each other. In contrast, our method works well even with a complicated background. Jiang introduced a novel WBC segmentation scheme by combining scale-space filtering and watershed clustering. Scale space filtering is used to obtain the nucleus from the sub-image. Watershed clustering in 3-D HSV (Hue, Saturation, Value) histogram is processed to extract the cytoplasm. However, this method may not be sufficient in the case of high density of cells, where the RBC are clustered close to the WBC.
To overcome the difficulty associated with the high density of cells, Dorini introduces the use of some simple morphological operators and explores the scale-space properties of a toggle operator to improve the segmentation accuracy in the WBCs. To avoid leaking, a common problem in cell images due to the low contrast between nucleus, cytoplasm and background, they used a scale-space toggle operator for contour regularization. This shows the importance of multiscale analysis. In this method, the cytoplasm segmentation presents a few limitations. The RBC touching the WBC also get detected as part of the cytoplasm. In our method we eliminate the RBCs before detecting the WBC.
The above-mentioned algorithms are based on edge information. Edge detection methods do not work very well when not all cell details are sharp. However, these methods work well if the contrast between the background and the gray internal membrane of the cell is stretched using a contrast stretching filter, as stated by Piuri and Scotti. Even so, the problem of overlapping red and white blood cells remains unresolved. Sadeghian's method used Zack's simple thresholding method for cytoplasm segmentation, which is based on the fact that the color intensity of RBC in a blood image is quite different from that of cytoplasm. The nuclei segmentation was done using a GVF (Gradient Vector Flow) snake.
A few approaches that were introduced recently dealt only with the segmentation of the nucleus of the WBC. For instance, Hamghalam and Ayatollahi used histogram analysis and measurement of distance among nuclei. The thresholding point was chosen based on the histogram analysis. Nuclei whose distances are less than the diameter of the WBC were merged.
Rezatofighi introduced a novel method based on orthogonality theory and Gram-Schmidt process for segmenting the WBC nuclei. In these two approaches, cytoplasm segmentation was not addressed. We address cytoplasm segmentation in addition to nuclei segmentation. The successful segmentation of the cytoplasm along with the nucleus segmentation aids in the automatic classification of the WBC. As a part of our preliminary studies we implemented the segmentation of WBCs based on the algorithm proposed by Dorini.
WBC Classification in Peripheral Blood Smear
WBCs are classified according to the characteristics of their cytoplasm and nucleus. The WBCs are classified into five classes, i.e., monocyte, lymphocyte, neutrophil, eosinophil and basophil. Pathologists mostly consider this five-subtype classification. The neutrophils can be further subdivided into segmented neutrophils and bands. Since the chosen features affect the classifier performance, deciding which features must be used in a specific data classification problem is as important as the classifier itself. Hematology experts examine the shape of the cells and the nuclei, and the color and texture of the cells. It is important to reflect the rules and heuristics used by the hematology experts in selection of the features. Several researchers have previously proposed features to differentiate the WBCs.
As proposed by Ongun, several types of features such as intensity and color based features, texture based features and shape based features are utilized for a robust representation of the WBC. Classification method used in this work includes k-Nearest Neighbors (k-NN), Learning Vector Quantization (LVQ), MultiLayer Perceptron (MLP) and Support Vector Machine (SVM).
Piuri and Scotti evaluated the binarized cytoplasm membrane and the nucleus to characterize the feature set. The standard set of features like area, perimeter, convex areas, solidity, major axis length, orientation, filled area, eccentricity were separately evaluated for the nucleus and the cytoplasm. In addition special features like the ratio between the nucleus and the cell areas, the nucleus' “rectangularity” (ratio between the perimeter of the tightest bounding rectangle and the nucleus perimeter), the cell “circularity” (ratio between the perimeter of the tightest bounding circle and the cell perimeter), number of lobes in nucleus, area and mean gray-level intensity of the cytoplasm were computed. Their system was evaluated using 10 fold cross-validation. The performance was compared using different classifiers like nearest neighbor classifiers (kNN), Feed-forward neural network (FF-NN), Radial Basis Function neural network (RBF), parallel classifier built with feed-forward neural network. In our method we suggest a preliminary classification of the WBC based on the number of lobes (single or multi-lobed) in the nucleus along with the feature set to get a better classification rate for each of the WBC subtype. As a part of the preliminary studies the features we evaluated were based on features selected by Scotti and Piuri.
Our approach aims at achieving a robust scheme to identify the WBC accurately. False negatives in identification of WBC can lead to wrong diagnosis. We propose a segmentation scheme based on the difference in color channels and morphological operations to segment the WBC. The algorithm has low computation cost but good accuracy. Two-step classification with the aid of a comprehensive feature set is helpful in realizing better accuracy rates in the classification of the WBCs. We validate our approach on 320 images with 1,938 cells.
Segmentation of Hematopoietic Cells in Bone Marrow
Image segmentation in the bone marrow is a difficult and challenging problem due to the complex appearance of these cells and the uncertainty and inconsistencies in microscopic images with varying illumination. There is a fairly wide variation in the size and shape of nuclei and cytoplasm regions within given cell classes, making the segmentation problem a bigger challenge. The maturity classes of the cells in the bone marrow actually represent a continuum. Furthermore, cells frequently overlap each other. Improving cell segmentation has been a common goal of many research efforts. As proposed by Ongun, hematopoietic cells can be segmented in a manner similar to the segmentation of the blood cells in the peripheral blood smear. A few approaches deal only with the segmentation of the Leukocyte differentiation series in the bone marrow samples, which reveals important diagnostic information about patients. Most of the segmentation methods for the Leukocyte differentiation series in bone marrow are based on Fuzzy Logic. Park and Keller introduced a technique based on the Principle of Least Commitment for the segmentation of the Leukocyte differentiation series in the bone marrow. The watershed algorithm is used to perform an “over-segmentation” of the image where each primitive patch is no bigger than one of the cell components. The patch label memberships were relaxed in order to obtain more consistent labels for merging into cell objects. Similarly, Sobrevilla uses an approach based on fuzzy techniques to segment the Leukocyte differentiation series in bone marrow. They distinguish interest regions, which contain cells that belong to the Leukocyte differentiation series, from no-interest regions, which contain background and red blood cells, using local features such as gray level intensity and homogeneity. Hematopoietic cell segmentation in the bone marrow is in a rudimentary stage and has a lot of scope for improvement.
An algorithm based on morphological watersheds was proposed by Malpica et al. and tested on the segmentation of microscopic nuclei clusters. The method fails when multiple nuclei exist in a single cell. Our method uses circle detection to find the number of cells in a given region and therefore segments cells with multiple nuclei correctly. The methods proposed by Berge et al., Kong et al., and Wen et al. were studied for splitting clumped cells.
The present disclosure describes a robust scheme to identify the WBC accurately in peripheral blood smear. A segmentation scheme with low computation cost but good efficiency has been implemented. In addition to the evaluation of the peripheral blood smear, we have segmented the hematopoietic cells from aspirate smears in the bone marrow. Our segmentation algorithm is based on a novel application of the Hough Transform to find circles in images and a splitting algorithm based on the detected circle centers.
This portion of the disclosure includes four sections. The first section gives a brief overview of the mathematical morphology operators used in this work. The second section describes the algorithm for the detection and segmentation of WBCs in peripheral blood smear. The third section describes the first step of a two-step classification process, in which a WBC is broadly classified as a cell with segmented nucleus or a cell with non-segmented nucleus. The last section describes the set of features used to classify the WBC to its final type.
Morphological Tools
Mathematical morphology has been used in many operations for the segmentation of the nucleus and the cytoplasm. The most important morphological tools used are erosion and dilation.
Dilation: With $A$ and $B$ as sets in $\mathbb{Z}^2$, the dilation of $A$ by $B$ is defined as
$$A \oplus B = \{\, z \mid (\hat{B})_z \cap A \neq \emptyset \,\} \tag{3.1}$$
This equation is based on reflecting $B$ about its origin, and shifting this reflection by $z$. The reflection of set $B$, denoted by $\hat{B}$, is defined as
$$\hat{B} = \{\, w \mid w = -b, \text{ for } b \in B \,\}.$$
The dilation of $A$ by $B$ then is the set of all displacements, $z$, such that $\hat{B}$ and $A$ overlap by at least one element. $B$ is commonly referred to as the structuring element.
Erosion: With $A$ and $B$ as sets in $\mathbb{Z}^2$, the erosion of $A$ by $B$ is defined as
$$A \ominus B = \{\, z \mid (B)_z \subseteq A \,\} \tag{3.2}$$
In words, this equation indicates that erosion of $A$ by $B$ is the set of all points $z$ such that $B$, translated by $z$, is contained in $A$.
Opening: Morphological opening generally smoothes the contour of an object, breaks narrow isthmuses, and eliminates thin protrusions. The opening of a set $A$ by structuring element $B$, denoted as $A \circ B$, is defined as
$$A \circ B = (A \ominus B) \oplus B \tag{3.3}$$
Closing: Closing tends to smooth sections of contours, it generally fuses narrow breaks and long thin gulfs, eliminates small holes and fills gaps in the contour. The closing of a set $A$ by structuring element $B$, denoted as $A \bullet B$, is defined as
$$A \bullet B = (A \oplus B) \ominus B \tag{3.4}$$
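By way of non-limiting illustration, the four operators of Equations (3.1) to (3.4) could be sketched directly on sets of integer coordinates. This is a minimal sketch for clarity rather than speed; the function names and the 3×3 square structuring element used in the example are assumptions chosen for illustration.

```python
def translate(B, z):
    """(B)_z: the set B shifted by the displacement z."""
    return {(bx + z[0], by + z[1]) for (bx, by) in B}

def dilate(A, B):
    """Equation (3.1). A displacement z belongs to the dilation exactly when
    z - b lies in A for some b in B, i.e. when z = a + b."""
    return {(ax + bx, ay + by) for (ax, ay) in A for (bx, by) in B}

def erode(A, B):
    """Equation (3.2): all z such that B translated by z is contained in A."""
    b0 = next(iter(B))
    # Any valid z must place b0 on a point of A, so only z = a - b0 is tested.
    candidates = {(ax - b0[0], ay - b0[1]) for (ax, ay) in A}
    return {z for z in candidates if translate(B, z) <= A}

def opening(A, B):
    """Equation (3.3): erosion followed by dilation."""
    return dilate(erode(A, B), B)

def closing(A, B):
    """Equation (3.4): dilation followed by erosion."""
    return erode(dilate(A, B), B)

# Example data (assumed): a 3x3 block, a thin protrusion, and a one-pixel hole.
block = {(x, y) for x in range(3) for y in range(3)}
square = {(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)}
with_protrusion = block | {(3, 1), (4, 1)}
with_hole = block - {(1, 1)}

opened = opening(with_protrusion, square)  # the thin protrusion is removed
closed = closing(with_hole, square)        # the small hole is filled
```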
Detection and Segmentation of White Blood Cells from Peripheral Blood Smear
We developed an automated system for the detection of the WBCs from blood smears. WBC counting is performed by pathologists only in thin sections of the blood smear since they have a single layer of cells. Similarly, we manually pick these regions and store the images corresponding to the thin sections of the blood smear. Each of these images has 2 to 15 WBCs.
There are three types of cells in normal human blood: RBC, WBC, and platelets. Generally, RBC are simple and similar to each other. WBC contain nucleus and cytoplasm and there are different subtypes. For easy identification, the WBCs are stained with the Giemsa stain and hence have a dark intensity. Since we have been using images from the same laboratory, the stains used on the slides are similar. We convert the image from RGB (Red, Green, Blue) to HSV (Hue, Saturation, Value) space. First, we threshold the saturation image to obtain the nucleus approximately. Saturation refers to the dominance of hue in the color. A saturation value of 0 implies that the color is desaturated, i.e., the hue has the least dominance. A saturation value of 1 indicates that there is maximum dominance of hue, i.e., a pure color; thus the range for the saturation image is between 0 and 1. A threshold value of 0.55 is used in our experiment. We eliminate the extremely small regions based on their area. The advantage of using the saturation image for thresholding is the elimination of illumination changes that occur in the image.
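By way of non-limiting illustration, the saturation thresholding and small-region removal described above could be sketched as follows. The threshold of 0.55 is the value stated in the text. The direction of the comparison (nucleus pixels having saturation above the threshold), the 4-connectivity, the `min_area` value and the example pixel colors are assumptions made for this sketch.

```python
import colorsys

SATURATION_THRESHOLD = 0.55  # value stated in the text

def saturation(rgb):
    """Saturation in [0, 1] of an 8-bit (R, G, B) pixel."""
    r, g, b = (c / 255.0 for c in rgb)
    return colorsys.rgb_to_hsv(r, g, b)[1]

def nucleus_mask(image, threshold=SATURATION_THRESHOLD):
    """Rows of booleans; True where saturation exceeds the threshold (assumed direction)."""
    return [[saturation(px) > threshold for px in row] for row in image]

def remove_small_regions(mask, min_area):
    """Drop 4-connected True regions with fewer than `min_area` pixels."""
    h, w = len(mask), len(mask[0])
    out = [[False] * w for _ in range(h)]
    seen = set()
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or (y, x) in seen:
                continue
            stack, region = [(y, x)], []
            seen.add((y, x))
            while stack:  # flood fill one connected region
                cy, cx = stack.pop()
                region.append((cy, cx))
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            if len(region) >= min_area:
                for (ry, rx) in region:
                    out[ry][rx] = True
    return out

# Assumed example colors: dark nucleus, pink RBC, near-white background.
N, R, B = (60, 40, 140), (230, 170, 180), (240, 240, 235)
image = [[N, N, B, B],
         [N, N, B, R],
         [B, B, B, N]]  # the lone N in the last row stands in for noise
mask = nucleus_mask(image)
cleaned = remove_small_regions(mask, min_area=2)
```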
We have implemented a novel and robust segmentation scheme for segmenting WBCs. First we threshold (threshold value=0.9) the given sub-image based on intensity and eliminate small regions to obtain a binary image. The intensities range between 0 and 255. The intensity images are scaled between 0 and 1 for thresholding. This eliminates the background pixels, leaving only the RBCs and WBC. We notice that the RBC are mostly pink/red in color and the WBC have a dark stained nucleus. We take advantage of the information present in the blue and red channels. Thresholding (threshold value=0.05) and smoothing the difference in the red and blue channels helps in eliminating a significant part of the RBCs. Subtracting the RBCs and the background from the original image results in an image with mostly WBC. Hence, we are left with the WBC which have possibly small parts of RBCs attached to them. We fill in details if lost due to our thresholding operation. A standard hole filling algorithm is used to fill in these details. Using a disk shaped structuring element we get rid of the thin parts (morphological opening) of the RBCs which are attached to the WBC. This procedure helps in detecting the boundary of the WBC. Once the WBC boundary has been extracted we concentrate on separating the WBC region into cytoplasm and nucleus. We use the saturation image again to threshold the detected WBC. This thresholding (threshold value=0.55) operation yields a binary image of the nucleus. By taking the difference between the WBC image and the nucleus image we get the cytoplasm regions. Hence this extremely fast method can be used to segment the blood image to get the desired results.
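By way of non-limiting illustration, the sequence of thresholds described above could be sketched per pixel as follows. The threshold values 0.9, 0.05 and 0.55 are those stated in the text. Everything else is an assumption made for this sketch: intensity is taken as the mean of the scaled color channels, RBC pixels are taken to be those whose red channel exceeds the blue channel by more than the threshold, and the smoothing, hole filling and morphological opening steps are omitted.

```python
import colorsys

INTENSITY_THRESHOLD = 0.9      # from the text
CHANNEL_DIFF_THRESHOLD = 0.05  # from the text
SATURATION_THRESHOLD = 0.55    # from the text

def classify_pixel(rgb):
    """Label one 8-bit (R, G, B) pixel as background, rbc, nucleus or cytoplasm."""
    r, g, b = (c / 255.0 for c in rgb)
    intensity = (r + g + b) / 3.0          # assumption: plain channel mean
    if intensity >= INTENSITY_THRESHOLD:   # bright pixels are background
        return "background"
    if r - b > CHANNEL_DIFF_THRESHOLD:     # assumption: reddish pixels are RBC
        return "rbc"
    # What remains is treated as WBC; the saturated part is the nucleus and
    # the remainder (WBC minus nucleus) is the cytoplasm.
    s = colorsys.rgb_to_hsv(r, g, b)[1]
    return "nucleus" if s > SATURATION_THRESHOLD else "cytoplasm"

def segment(image):
    """Apply `classify_pixel` to every pixel of a row-major image."""
    return [[classify_pixel(px) for px in row] for row in image]

# Assumed example colors for the four pixel classes.
labels = segment([[(245, 245, 240), (210, 150, 160)],
                  [(60, 40, 140), (170, 180, 215)]])
```

A region-based implementation would operate on whole binary images rather than on isolated pixels, so this sketch only illustrates the order and role of the thresholds.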
Segmented Vs. Non-Segmented Nucleus Classification
WBCs can be broadly classified into cells with segmented nucleus and cells with non-segmented nucleus as shown in
Detection of Maximum Curvature Points
The goal of this step is to identify local curvature maxima corresponding to sharp bends (corners). We utilize global and local curvature properties in extracting the maximum curvature points. After contour extraction we compute the curvature using Equation (3.5) and retain the local curvature maxima points.
where x(s) and y(s) represent the x-coordinates and y-coordinates of the boundary points, respectively; x′ and y′ represent the first derivatives with respect to s, and x″ and y″ represent the second derivatives with respect to s.
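Equation (3.5) is referenced above but is not reproduced in this text. By way of non-limiting illustration, and assuming the standard planar curvature κ(s) = (x′y″ − y′x″) / (x′² + y′²)^(3/2), which uses exactly the derivatives defined above, the curvature of a closed contour and its local maxima could be sketched as follows. The central finite differences and the function names are assumptions of this sketch.

```python
import math

def contour_curvature(points):
    """Signed curvature at each point of a closed contour given as (x, y) pairs.

    Derivatives with respect to the contour index s are approximated with
    central finite differences; indices wrap around because the contour is closed.
    """
    n = len(points)
    kappa = []
    for i in range(n):
        (x0, y0), (x1, y1), (x2, y2) = points[i - 1], points[i], points[(i + 1) % n]
        dx, dy = (x2 - x0) / 2.0, (y2 - y0) / 2.0        # x', y'
        ddx, ddy = x2 - 2 * x1 + x0, y2 - 2 * y1 + y0    # x'', y''
        denom = (dx * dx + dy * dy) ** 1.5
        kappa.append((dx * ddy - dy * ddx) / denom if denom > 0 else 0.0)
    return kappa

def local_maxima(values):
    """Indices whose absolute value strictly exceeds both cyclic neighbours."""
    n = len(values)
    a = [abs(v) for v in values]
    return [i for i in range(n) if a[i] > a[i - 1] and a[i] > a[(i + 1) % n]]

# Sanity check on a circle of radius 5, whose curvature should be close to 1/5.
circle = [(5 * math.cos(2 * math.pi * k / 200), 5 * math.sin(2 * math.pi * k / 200))
          for k in range(200)]
circle_kappa = contour_curvature(circle)
```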
Elimination of low curvature maxima is done by calculating an adaptive threshold according to the mean curvature within a region of interest. The region of interest (ROI) of a maximum curvature point is defined as the segment of the contour between the two nearest curvature minima points surrounding it denoted by L1 and L2. The ROI of each maximum curvature point is used to calculate a local threshold adaptively where p is the position of the maximum curvature point on the contour, and R is a coefficient:
$\acute{\kappa}$ is the mean curvature of the ROI and $i$ is the index of the point on the nucleus boundary.
The absolute value of the curvature is used to distinguish low curvature maxima from high curvature maxima. A round corner (low curvature maximum) tends to have an absolute curvature smaller than T(p), while a sharp corner (high curvature maximum) tends to have an absolute curvature larger than T(p). The reasoning for choosing an appropriate value of R can be found in He et al. A value of 1.5 is used for R in our experiment.
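The expression for T(p) is not reproduced in this text. By way of non-limiting illustration, and assuming that T(p) equals R times the mean absolute curvature over the ROI, the adaptive test could be sketched as follows. The way the ROI is found (walking outward from p until the absolute curvature stops decreasing) and the function names are assumptions of this sketch; R = 1.5 is the value stated in the text.

```python
def roi_bounds(abs_k, p):
    """Number of contour points to the left and right of p, up to the nearest
    curvature minima, on a closed contour."""
    n = len(abs_k)
    left = 0
    while left < n - 1 and abs_k[(p - left - 1) % n] < abs_k[(p - left) % n]:
        left += 1
    right = 0
    while right < n - 1 and abs_k[(p + right + 1) % n] < abs_k[(p + right) % n]:
        right += 1
    return left, right

def adaptive_threshold(abs_k, p, R=1.5):
    """Assumed form of T(p): R times the mean absolute curvature over the ROI of p."""
    left, right = roi_bounds(abs_k, p)
    n = len(abs_k)
    roi = [abs_k[(p + i) % n] for i in range(-left, right + 1)]
    return R * sum(roi) / len(roi)

def is_sharp_corner(curvature, p, R=1.5):
    """True when |curvature| at p exceeds T(p), i.e. p is a high curvature maximum."""
    abs_k = [abs(k) for k in curvature]
    return abs_k[p] > adaptive_threshold(abs_k, p, R)

# Assumed example profiles: an isolated spike versus a broad, gentle bump.
sharp_profile = [0.1, 0.1, 0.2, 2.0, 0.2, 0.1, 0.1, 0.1]
round_profile = [0.8, 0.9, 1.0, 0.9, 0.8, 0.9, 1.0, 0.9]
```

In this sketch the spike at index 3 of `sharp_profile` is retained, while the gentle maximum at index 2 of `round_profile` is discarded as a round corner.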
Delaunay Triangulation of Points of Maximum Curvature
The goal of this step is to construct the set of all potential edges that might correspond to the boundaries between the different lobes of a segmented nucleus. The high curvature maxima found in the previous step are used as the candidate vertices for these edges. A triangle S from a triangulation T satisfies the Delaunay criterion if the interior of the circumcircle through the vertices of S does not contain any other points of the set. If all triangles satisfy the Delaunay criterion, then the triangulation T is called a Delaunay triangulation. We apply the Delaunay Triangulation (DT) to all points of maximum curvature to find candidate edges which potentially separate different segments of the nucleus. The results for an example contour are shown in
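The Delaunay criterion itself can be checked with the standard in-circle determinant. The following is a sketch of that test only, not of a full triangulation algorithm; the function names are hypothetical.

```python
import numpy as np

def in_circumcircle(a, b, c, p):
    """True if point p lies strictly inside the circumcircle of triangle (a, b, c)."""
    rows = [[v[0] - p[0], v[1] - p[1], (v[0] - p[0]) ** 2 + (v[1] - p[1]) ** 2]
            for v in (a, b, c)]
    det = np.linalg.det(np.array(rows, dtype=float))
    # The sign of det depends on the vertex order, so correct for orientation.
    orient = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return det * orient > 1e-12

def satisfies_delaunay(tri, points):
    """tri: three indices into points. True if no other point is inside the circumcircle."""
    a, b, c = (points[i] for i in tri)
    return not any(in_circumcircle(a, b, c, points[k])
                   for k in range(len(points)) if k not in tri)
```

For the points (0,0), (4,0), (0,4) and (1,1), the triangle over the first three points fails the criterion because (1,1) lies inside its circumcircle, whereas the triangle (0,0), (4,0), (1,1) satisfies it.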
Rules to Retain Necessary Edges
Shape and color based rules: A set of conditions is first checked to see whether a WBC has a segmented nucleus.
We used λ1=0.75, λ2=0.85, λ3=0.3.
Geometric rules: If the above conditions are not satisfied, a series of geometric rules helps us separate the cells with a segmented nucleus from the cells with a non-segmented nucleus. Let sij be the edge connecting two points of maximum curvature pi and pj. Ti and Tj are the unit vectors representing the tangent directions at pi and pj. The following set of rules is used:
Valid edges are those for which the tangent vector and the edge vector are nearly perpendicular to each other. This condition retains edges in which the angle between the edge vector and the corresponding tangent vector is close to π/2 radians.
We used Th1=0 and Th2=0.4.
We check the above conditions for each edge in the Delaunay triangulation. Finally we check whether the number of remaining edges is greater than zero. If edges still remain, then we classify the WBC as a cell with a segmented nucleus. Otherwise, the WBC is classified as a cell with a non-segmented nucleus.
An example of this classification is shown in
Automated Classification of WBC to its Subtype.
To classify the WBC into its respective subtype, we use features that describe the characteristics of the cytoplasm and the nucleus. We choose 19 features such as area, perimeter, convex area, solidity, orientation and eccentricity, separately evaluated for the nucleus and the WBC. The result obtained from the previous step gives us information about the broad nucleus type (segmented or non-segmented). This result is a novel binary feature added to our classifier. In addition, special features like “circularity” of the nucleus (ratio between the perimeter of the tightest bounding circle and the nucleus perimeter), nucleus to cytoplasm ratio, ratio of nucleus area to area of WBC, entropy of the cytoplasm and mean gray-level intensity of the cytoplasm (all 3 color channels) are computed. Fisher's linear discriminant is used to reduce our multi-dimensional data set to six dimensions. We use a linear discriminant in this 6-dimensional space to classify the data into its respective type. The classifier is biased using the number of samples in each class. The system is evaluated using 10-fold cross-validation.
Segmentation Results
This system has been tested using images obtained from the ARUP Laboratory, University of Utah. Our dataset consists of 320 images that contain 1,938 expert labeled WBCs. The input images have been processed to detect the WBCs. We obtained 1,938 sub-images with single correctly positioned WBCs. We did not have any false negatives in the detection of WBCs. The false positive rate was 10.25%. We eliminated the false positives by manually assigning them to a noise class. The performance of the segmentation algorithm was evaluated by comparing our proposed method with a hematologist's visual segmentation. The segmentation algorithm was applied to the 1,938 sub-images of WBCs, of which 1,804 were accurately segmented. The accuracy rate for segmentation was 93.08%.
Classification Results
A vector of 19 features was extracted for every WBC. The experts assigned the correct classification to each extracted WBC. The resulting dataset (1,938 (number of cells)×19 (features)) has been used to determine the subtype of the segmented WBC using linear discriminants as described herein. The system was evaluated using the 10-fold cross-validation technique.
We observed a large number of misclassifications between segmented neutrophils and bands, segmented neutrophils and eosinophils, segmented neutrophils and lymphocytes, and segmented neutrophils and monocytes, as seen in Table 4.1. The rows in the confusion matrix represent the real subtype and the columns represent the detected subtype. An initial classification of the WBC into cells with a segmented nucleus (neutrophils, eosinophils, basophils) and cells with a non-segmented nucleus (band, lymphocyte, monocyte) improves the performance of the classification. The accuracy of classification is improved for all the classes; see Table 4.2.
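[Tables 4.1 and 4.2: confusion matrices of real versus detected WBC subtype. Recoverable cell values: 2, 734, 33, 4, 242, 40, 33, 1124, 48, 8, 362, 97.]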
We see that one of the important misclassifications still prevalent is in the band class. The data from the class “Band” and the class “Segmented Neutrophil” seem to overlap each other. Segmented neutrophils are simply a mature stage of bands, so the two have similar features, and it is difficult to differentiate between these two subtypes.
[Table: recoverable cell values 1289, 46, 8, 362, 97.]
Results indicate that the morphological analysis of white blood cells is achievable and it offers good classification accuracy.
We developed an automated system for the detection and segmentation of hematopoietic cells from aspirate smears in the bone marrow. Hematopoietic cell counting is performed by pathologists only in thin sections of the aspirate smear. Similarly, we manually pick these regions and store the images corresponding to the thin sections of the aspirate smear. There are three main categories of hematopoietic cells: Erythrocyte or Normoblast differentiation series, Leukocyte differentiation series and megakaryocytes. Erythrocyte or Normoblast differentiation series help in the production of RBC. Leukocyte differentiation series help in the production of WBCs. Megakaryocytes are responsible for the production of platelets.
Hematopoietic Cell Segmentation in Bone Marrow
We implemented a novel scheme for segmenting the hematopoietic cells in bone marrow. First we threshold the given image based on intensity and eliminate small regions to obtain a binary image (threshold value=0.7). This eliminates the background pixels, leaving only the RBC, clumped platelets and the hematopoietic cells. We notice that the RBC are mostly pink in color and the hematopoietic cells have a dark stained nucleus. We take advantage of the information present in the blue and red channels. Thresholding and smoothing the difference in the red and blue channels helps in eliminating a significant portion of the RBCs (threshold value=0.05). Subtracting the RBCs and the background from the original image results in an image with mostly hematopoietic cells and clumped platelets. Hence, we are left with the hematopoietic cells which have possibly small parts of RBCs attached to them. We fill in details if lost due to our thresholding operation. A standard hole filling algorithm is used to fill in these details. Using a disk shaped structuring element we get rid of the thin parts (morphological opening) of the RBC which are attached to the hematopoietic cells.
Cell regions that are very close to each other are detected as a single cell. The clumped cells are segmented using circle detection on the morphologically processed image and a splitting algorithm based on the position and number of detected circle centers. We used the Circular Hough Transform to detect the circles present in each connected region. Given the equation for a circle, $(x-a)^2+(y-b)^2=r^2$, the detection of circles requires a 3D parameter space (a, b, r). An accumulator array is built with the following steps:
1. First we find all edges in the image using the color thresholding and morphological filtering procedures described above.
2. Then, at each edge point we increment the accumulator cells that lie, in the parameter space, on the circle of the desired radius centered at that edge point.
3. Step 2 is repeated for all admissible radii.
The accumulator cells then contain numbers corresponding to the number of circles that fit the specific parameters represented by the cells. We find one or several local maxima in the accumulator, which correspond to the circles in the image. The detected center and radius of each circle are saved for every region.
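The accumulator construction in steps 1 to 3 can be sketched as follows. This is a minimal sketch assuming a binary edge mask is already available; the function name `hough_circles`, the angular sampling `n_angles` and the nearest-pixel rounding are illustrative choices, not details of our implementation.

```python
import numpy as np

def hough_circles(edge_mask, radii, n_angles=90):
    """Return accumulator acc[k, b, a]: votes for a circle of radii[k] centered at (a, b)."""
    h, w = edge_mask.shape
    acc = np.zeros((len(radii), h, w), dtype=np.int32)
    ys, xs = np.nonzero(edge_mask)                      # step 1: edge points
    theta = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    for k, r in enumerate(radii):                       # step 3: all admissible radii
        # Step 2: every edge point votes for the centers lying at distance r from it.
        a = np.rint(xs[:, None] - r * np.cos(theta)[None, :]).astype(int)
        b = np.rint(ys[:, None] - r * np.sin(theta)[None, :]).astype(int)
        ok = (a >= 0) & (a < w) & (b >= 0) & (b < h)    # discard votes outside the image
        np.add.at(acc[k], (b[ok], a[ok]), 1)            # unbuffered add handles repeated cells
    return acc
```

The strongest cell, for example `np.unravel_index(np.argmax(acc), acc.shape)`, gives the radius index and the center (b, a) of the best-supported circle. Detecting several circles per region requires finding further local maxima, which is not shown here.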
Circle detection helps in identifying the number of centers in a given region. The regions containing one or more cells generally have one or more lobes, where each lobe typically corresponds to a single cell. Each circle that is detected by the procedure outlined above substantially covers a lobe of the region. The splitting of these lobes, described below, then helps provide a count of the number of single cells in the cluster that is encompassed by the region. An example of circular detection on the morphologically processed image is shown in
1. Detection of maximum curvature points on the boundary.
2. Delaunay Triangulation of Points of Maximum Curvature. This step is used to construct the set of all potential edges that might correspond to the boundaries between the different cells that are clumped together. Let sij be the edge connecting two points of maximum curvature pi and pj. Ti and Tj are the unit vectors representing the tangent directions at pi and pj.
3. Eliminate edges that pass through the background and edges that intersect the boundary. This criterion helps in retaining edges that are inside the region only. FIG. (5.5(c))
4. Eliminate edge sij if (Ti·Tj)>0. This retains edges only if the angle between the tangents Ti and Tj is close to π radians. Such an edge has its two endpoints on opposite sides of the contour and is therefore a valid edge.
5. The edge points M and N are valid when xm, xn ∈ (xp, xq) ∪ ym, yn ∈ (yp, yq).
6. The distance between the midpoint of PQ and the midpoint of MN should be less than a threshold, e.g. T=15 pixels.
7. The edge is valid if |d1−d2|+|d3−d4| is less than a threshold, e.g. 25 pixels, i.e. the endpoints are approximately equidistant from the edges. If no edges are found, then the neighboring center is chosen once again to get the next closest center and the last three steps are repeated.
Steps 5-7 represent automated mechanisms for validating the edge as a valid location to split the adjacent lobes of the first region.
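Steps 4 to 7 can be sketched as simple geometric predicates. This sketch rests on several assumptions that the list above does not state: P and Q are taken to be two neighboring detected circle centers, M and N the endpoints of a candidate edge, d1 and d2 the distances from M to P and Q, and d3 and d4 the distances from N to P and Q. The function names are hypothetical.

```python
import numpy as np

def tangents_oppose(t_i, t_j):
    """Step 4: keep the edge only if the endpoint tangents do not point the same way."""
    return float(np.dot(t_i, t_j)) <= 0.0

def edge_splits_centers(m, n, p, q, t_mid=15.0, t_dist=25.0):
    """Steps 5-7 for edge MN and centers P, Q (assumed meanings, see lead-in)."""
    m, n, p, q = (np.asarray(v, dtype=float) for v in (m, n, p, q))
    # Step 5: both endpoints lie between the centers in x, or both do in y.
    in_x = all(min(p[0], q[0]) < v < max(p[0], q[0]) for v in (m[0], n[0]))
    in_y = all(min(p[1], q[1]) < v < max(p[1], q[1]) for v in (m[1], n[1]))
    if not (in_x or in_y):
        return False
    # Step 6: the midpoints of PQ and MN must be close together.
    if np.linalg.norm((p + q) / 2.0 - (m + n) / 2.0) >= t_mid:
        return False
    # Step 7: each endpoint should be roughly equidistant from the two centers.
    d1, d2 = np.linalg.norm(m - p), np.linalg.norm(m - q)
    d3, d4 = np.linalg.norm(n - p), np.linalg.norm(n - q)
    return bool(abs(d1 - d2) + abs(d3 - d4) < t_dist)
```

For centers at (0,0) and (40,0), an edge from (20,10) to (20,−10) passes all three checks, while the same edge shifted to x=38 fails the midpoint test of step 6.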
Regions that have clumped platelets or RBCs get detected as possible cells. We consider each of the regions detected, threshold it at a high intensity value to retain only the nucleus. The thresholded regions are processed using morphological opening. If the number of objects in the morphologically processed region is zero then the region is eliminated. In this manner the region containing only platelets or RBCs can be eliminated.
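The elimination test can be sketched as below, starting from a nucleus mask that has already been obtained by thresholding. A 3×3 square structuring element is used here as a stand-in; the choice of element and the names `erode3`, `dilate3` and `has_nucleus` are assumptions made for illustration.

```python
import numpy as np

def erode3(mask):
    """Binary erosion with a 3x3 square element (pixels outside the image count as background)."""
    mask = np.asarray(mask, dtype=bool)
    h, w = mask.shape
    p = np.pad(mask, 1, constant_values=False)
    out = np.ones((h, w), dtype=bool)
    for dy in range(3):
        for dx in range(3):
            out &= p[dy:dy + h, dx:dx + w]
    return out

def dilate3(mask):
    """Binary dilation with a 3x3 square element."""
    mask = np.asarray(mask, dtype=bool)
    h, w = mask.shape
    p = np.pad(mask, 1, constant_values=False)
    out = np.zeros((h, w), dtype=bool)
    for dy in range(3):
        for dx in range(3):
            out |= p[dy:dy + h, dx:dx + w]
    return out

def has_nucleus(nucleus_mask):
    """True if anything survives morphological opening of the thresholded region."""
    opened = dilate3(erode3(nucleus_mask))
    return bool(opened.any())
```

A region for which `has_nucleus` returns False contains no object after opening and would be discarded as platelets or RBCs.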
This system has been tested using images obtained from the ARUP Laboratory, University of Utah. Our dataset consists of 334 images that contain 3,748 expert-labeled hematopoietic cells. In the detection of hematopoietic cells, the false negative rate was 1.62% and the false positive rate was 1.2%. The performance of the segmentation algorithm was evaluated by comparing our proposed method with a hematologist's visual segmentation. Of the 3,748 cells, 3,525 were accurately segmented, for a segmentation accuracy rate of 94.05%. FIGS. (6.1, 6.2, 6.3) show segmentation results.
As discussed above, in various embodiments, the disclosed methods may be implemented on one or more computer systems 12 such as that shown in
In some embodiments, implementation of the disclosed methods may include generating one or more web pages for facilitating input, output, control, analysis, and other functions. In other embodiments, the methods may be implemented as a locally-controlled program on a local computer system which may or may not be accessible to other computer systems. In still other embodiments, implementation of the methods may include generating and/or operating modules which provide access to portable devices such as laptops, tablet computers, digitizers, digital tablets, smart phones, and other devices.
Each of the following references is hereby incorporated by reference in its entirety:
Thus, the invention provides, among other things, methods and systems for automatically identifying individual cells in an image. Various features and advantages of the invention are set forth in the following claims.
This application claims the benefit of U.S. provisional application No. 61/546,518 filed Oct. 12, 2011, which is incorporated herein by reference in its entirety.
Number | Date | Country
---|---|---
61546518 | Oct 2011 | US