The present invention is directed to a method for object detection in images using a probabilistic boosting cascade tree (PBCT). Embodiments of the present invention are described herein to give a visual understanding of the object detection method. A digital image is often composed of digital representations of one or more objects (or shapes). The digital representation of an object is often described herein in terms of identifying and manipulating the object. Such manipulations are virtual manipulations accomplished in the memory or other circuitry/hardware of a computer system. Accordingly, it is to be understood that embodiments of the present invention may be performed within a computer system using data stored within the computer system.
An embodiment of the present invention in which a PBCT is trained and used to detect lymph nodes in a CT volume is described herein. It is to be understood that the present invention is not limited to this embodiment and may be used for detection of various objects and structures in various types of image data. The present invention can also be applied to any other type of data classification problem.
As described above, cascades and probabilistic boosting trees have various advantages and disadvantages. Accordingly, it is desirable to utilize the advantages of both structures. For example, it is possible to put a number of cascades before a PBT structure in order to filter out a percentage of the negative samples before processing data using the PBT to learn a more powerful classifier for the samples remaining after the cascades. However, this approach requires that the number of cascades be manually tuned or selected by a user. If the classification problem is easy, more cascades should be used, and if the classification problem is difficult, cascades before the PBT may be useless. Thus, the number of cascades has to be tuned by a user by trial and error. Furthermore, this approach does not allow for cascades inside of the PBT. At a node inside the tree, a learned classifier may be quite effective. In this case, it is not necessary to split the samples into two child nodes and train both nodes, as is required by a tree node in a PBT. Accordingly, embodiments of the present invention provide an adaptive way to take advantage of both the tree and cascade structures in a PBCT. The structure of a PBCT includes both cascade nodes and tree nodes and is adaptively tuned on-line based on the training data without any user manipulation or input. Thus, within a PBCT, nodes which perform effective classification can be treated as cascade nodes that discard negatively classified data, while less effective nodes are treated as tree nodes that split the data into two child nodes for further classification.
At step 402, training data is received at a current node. The training data can be annotated to show positive and negative samples.
Returning to
At step 406, the performance of the classifier trained for the current node is evaluated based on the training data. Accordingly, the training data is used to test the classifier trained for the current node in order to calculate a detection rate and a false positive rate. The detection rate is a measure of a percentage of positive samples in the training data that were classified as positive, and the false positive rate is a measure of a percentage of negative samples in the training data that were classified as positive. If the data for that node is relatively easy to classify, the classifier will have a high detection rate and a low false positive rate. If the data is relatively difficult to classify, the classifier will have a low detection rate and a high false positive rate. Accordingly, in order to evaluate the performance of the trained classifier, the detection rate can be compared to a first threshold, and the false positive rate can be compared to a second threshold.
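The evaluation at step 406 can be sketched as follows. This is a minimal illustrative helper, not the patent's implementation; the function and variable names are assumptions.

```python
def evaluate_classifier(predictions, labels):
    """Compute detection rate and false positive rate from binary
    predictions against ground-truth labels (1 = positive, 0 = negative).
    Illustrative helper; names are hypothetical."""
    pos_preds = [p for p, y in zip(predictions, labels) if y == 1]
    neg_preds = [p for p, y in zip(predictions, labels) if y == 0]
    detection_rate = sum(pos_preds) / len(pos_preds)        # TP / (TP + FN)
    false_positive_rate = sum(neg_preds) / len(neg_preds)   # FP / (FP + TN)
    return detection_rate, false_positive_rate
```

For example, a classifier that detects two of three positive samples and misclassifies one of three negative samples has a detection rate of 2/3 and a false positive rate of 1/3.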
The training method performs alternate steps depending on the evaluated performance of the trained classifier. If the trained classifier has a high detection rate and a low false positive rate (408), the method proceeds to step 412. For example, if the detection rate is greater than or equal to the first threshold and the false positive rate is less than or equal to the second threshold, the method can proceed to step 412. If the trained classifier has a low detection rate or a high false positive rate (410), the method can proceed to step 414. For example, if the detection rate is less than the first threshold or the false positive rate is greater than the second threshold, the method can proceed to step 414. According to an advantageous embodiment of the present invention, the first threshold can be 97% and the second threshold can be 50%, but the present invention is not limited thereto.
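The decision between the two branches can be written as a simple rule. The sketch below uses the example thresholds from the embodiment above (97% and 50%); the function name is illustrative.

```python
# Example thresholds from the embodiment described above.
FIRST_THRESHOLD = 0.97   # minimum detection rate for a cascade node
SECOND_THRESHOLD = 0.50  # maximum false positive rate for a cascade node

def choose_node_type(detection_rate, false_positive_rate):
    """Return 'cascade' when the node's classifier is effective enough
    to safely discard its negatively classified samples, 'tree' otherwise."""
    if detection_rate >= FIRST_THRESHOLD and false_positive_rate <= SECOND_THRESHOLD:
        return "cascade"
    return "tree"
```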
At step 412, the current node is set as a cascade node. Accordingly, the current node will have one child node in the next level of the tree and only the training data classified as positive by the current node will be used to train the child node. The training data classified as negative by the current node is discarded with no further processing or classification.
At step 414, the current node is set as a tree node. Accordingly, the current node will have two child nodes in the next level of the tree. One of the child nodes will be trained using the training data classified as positive by the current node, and one of the child nodes will be trained using the training data classified as negative by the current node. Accordingly, the structure for a next level of the tree is not known until the prior level is trained. Thus, the structure of the PBCT is automatically constructed level by level during the training of the PBCT.
For each node in the PBCT, the training method determines whether the number of training samples for the node is less than a certain threshold. If the number of training samples is less than the threshold, the node will not be further expanded such that no child nodes are generated for that node. Accordingly, the structure of the PBCT is determined such that each branch of the PBCT ends in a terminal node at which there is a relatively small number of training samples.
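The level-by-level construction described above can be sketched as a recursion. This is a toy sketch under stated assumptions: the node classifier is a stand-in (a simple mean threshold on a single feature rather than a learned boosted classifier), the node dictionary layout is hypothetical, and the sample-count threshold is illustrative.

```python
MIN_SAMPLES = 4  # illustrative threshold on node sample count

def train_stub_classifier(samples):
    """Toy stand-in for the node classifier (a boosted classifier in
    practice): threshold each sample's single feature at the mean."""
    thr = sum(x for x, _ in samples) / len(samples)
    return lambda x: x >= thr

def build_pbct(samples, first_thr=0.97, second_thr=0.50):
    """Recursively build a PBCT node (as a dict), deciding cascade vs.
    tree on-line from the training data, as described above."""
    if len(samples) < MIN_SAMPLES:
        return None  # terminal node: too few samples to expand further
    clf = train_stub_classifier(samples)
    pos = [(x, y) for x, y in samples if clf(x)]
    neg = [(x, y) for x, y in samples if not clf(x)]
    if not neg:  # degenerate split: stop to guarantee progress
        return {"type": "cascade", "clf": clf, "pos": None}
    n_true_pos = sum(1 for _, y in samples if y == 1) or 1
    n_true_neg = sum(1 for _, y in samples if y == 0) or 1
    dr = sum(1 for _, y in pos if y == 1) / n_true_pos
    fpr = sum(1 for _, y in pos if y == 0) / n_true_neg
    if dr >= first_thr and fpr <= second_thr:
        # cascade node: one child on positives; negatives are discarded
        return {"type": "cascade", "clf": clf, "pos": build_pbct(pos)}
    # tree node: both children are trained
    return {"type": "tree", "clf": clf,
            "pos": build_pbct(pos), "neg": build_pbct(neg)}
```

Note that each child's subtree is built only after its parent has classified the training data, which reflects the property that the structure of a level is not known until the prior level is trained.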
At step 704, voxels of the CT volume that are not within an expected intensity range of the lymph nodes are discarded. The voxel intensities in CT volumes range from 0 to about 2400. The intensity values of lymph nodes tend to fall within a more specific range.
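The pre-filtering step can be sketched as a mask over the volume. The bounds below are hypothetical placeholders: the text states only that lymph node intensities fall within a narrower sub-range of 0 to about 2400.

```python
import numpy as np

# Hypothetical bounds for illustration only; the actual expected
# intensity range of lymph nodes is not specified here.
LOW, HIGH = 800, 1200

def candidate_voxel_mask(volume, low=LOW, high=HIGH):
    """Boolean mask of voxels whose intensity lies within the expected
    lymph node range; all other voxels are discarded before the PBCT runs."""
    return (volume >= low) & (volume <= high)
```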
At step 706, the remaining voxels of the CT volume are processed using a trained PBCT. As described above, the PBCT is trained based on training data including annotated lymph node voxels. The PBCT can include cascade nodes and tree nodes. Each node in the PBCT classifies all of the voxels received at the node as positive or negative. If a node is a cascade node the positively classified voxels are further classified at a child node, and the negatively classified voxels are discarded. If a node is a tree node, one child node further classifies positively classified voxels and another child node further classifies negatively classified voxels. Accordingly, the voxels of the CT volume are processed through all of the nodes of the trained PBCT such that a probability of being a lymph node can be determined for each voxel (discarded voxels have a probability of 0).
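The traversal at detection time can be sketched as follows. The node layout (dicts with hypothetical `type`, `clf`, `pos`, `neg`, and `prob` keys, where a leaf stores a probability) is an assumption for illustration, not the patent's data structure.

```python
def classify_voxel(node, features):
    """Walk a trained PBCT for one voxel and return its lymph node
    probability. Internal nodes hold a classifier; leaves hold 'prob'.
    Cascade nodes discard negatives (probability 0); tree nodes route
    negatives to a second child."""
    while "prob" not in node:
        if node["clf"](features):          # classified positive here
            node = node["pos"]
        elif node["type"] == "cascade":    # cascade node: discard negative
            return 0.0
        else:                              # tree node: follow negative child
            node = node["neg"]
    return node["prob"]
```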
The voxels positively detected as lymph nodes by the PBCT are spatially clustered. This suggests that it is possible to predict the probability of a voxel being a lymph node based on neighboring voxels. Accordingly, the PBCT can be used along with probability prediction to determine a probability of a voxel being a lymph node. First, the trained PBCT based detector can be used to scan across a CT volume with the pace along each axis set to be 2, so that every other voxel along each axis is scanned to determine its probability of being a lymph node. Therefore, the detector runs on ⅛ of the volume's voxels in this stage. The probabilities of the remaining voxels can then be predicted using tri-linear interpolation. If the predicted probability of a voxel is not large enough, the voxel is skipped without further processing. The predicted probability can be quite close to the probability calculated using the PBCT. Based on experiments to check the prediction error, the average error is μe=0.082 with standard deviation σe=0.014. Therefore, only if a voxel's predicted probability pe satisfies pe>Tp−0.124 (μe+3σe=0.124), where Tp is the detection threshold, is the probability for the voxel calculated using the trained PBCT. Otherwise, the voxel is discarded, because the probability that its calculated probability is greater than Tp is less than 0.03, i.e., P{pe>Tp}<0.03, assuming that pe obeys a Gaussian distribution. In this manner, it is possible to use the PBCT along with interpolation-based probability prediction to reduce detection time and reduce the false positive rate.
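The scan-and-predict scheme can be sketched in one dimension for brevity (the actual method uses a stride-2 scan along all three axes with tri-linear interpolation); the function names are illustrative, and `prob_fn` stands in for the trained PBCT detector.

```python
import numpy as np

MU_E, SIGMA_E = 0.082, 0.014   # prediction error statistics from the text
MARGIN = MU_E + 3 * SIGMA_E    # μe + 3σe: skip voxels whose predicted
                               # probability falls below Tp - MARGIN

def scan_with_prediction(prob_fn, n, Tp):
    """1-D analogue of the coarse-to-fine scan: evaluate prob_fn at every
    other position, linearly interpolate the skipped positions, and call
    prob_fn again only where the prediction could plausibly exceed Tp."""
    coords = np.arange(n)
    coarse = coords[::2]
    coarse_p = np.array([prob_fn(i) for i in coarse])
    predicted = np.interp(coords, coarse, coarse_p)
    probs = np.zeros(n)
    probs[coarse] = coarse_p
    for i in coords[1::2]:                 # positions skipped by the scan
        if predicted[i] > Tp - MARGIN:     # worth an exact evaluation
            probs[i] = prob_fn(i)          # else stays 0 (discarded)
    return probs
```

Only skipped positions whose interpolated probability clears the relaxed threshold trigger a full detector evaluation, which is how the scheme trades a small, bounded prediction error for a large reduction in detector invocations.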
The above-described methods for training a PBCT and object detection using a PBCT may be implemented on a computer using well-known computer processors, memory units, storage devices, computer software, and other components. A high level block diagram of such a computer is illustrated in
The foregoing Detailed Description is to be understood as being in every respect illustrative and exemplary, but not restrictive, and the scope of the invention disclosed herein is not to be determined from the Detailed Description, but rather from the claims as interpreted according to the full breadth permitted by the patent laws. It is to be understood that the embodiments shown and described herein are only illustrative of the principles of the present invention and that various modifications may be implemented by those skilled in the art without departing from the scope and spirit of the invention. Those skilled in the art could implement various other feature combinations without departing from the scope and spirit of the invention.
This application claims the benefit of U.S. Provisional Application No. 60/826,246, filed Sep. 20, 2006, the disclosure of which is herein incorporated by reference.
| Number | Date | Country |
|---|---|---|
| 60/826,246 | Sep. 20, 2006 | US |