This invention relates generally to medical imaging, and more specifically to image processing and computer aided diagnosis for diseases, such as colorectal cancer, using an automated image processing system that provides a rapid, inexpensive analysis of video from a standard endoscope, optionally including a 3 dimensional (“3D”) reconstructed view of the organ of interest, such as a patient's colon. This invention is intended to be used in real time, employing video data from a conventional endoscope that is already in use during an examination, such as a colonoscopy, to provide an instantaneous second opinion without substantially prolonging the examination.
Although this invention is being disclosed in connection with colorectal cancer, it is applicable to many other areas of medicine. Colorectal cancer is the second leading cause of cancer-related deaths in the United States. More than 130,000 people are diagnosed with colon cancer each year and about 55,000 people die from the disease annually. Colon cancer can be prevented and cured through early detection, so early diagnosis is of critical importance for patient survival (American Cancer Society, Cancer Facts and Figures, 2004, incorporated herein by reference). Screening for polyps using an endoscope is the current and most suitable prevention method for the early detection and removal of colorectal polyps. If such polyps remain in the colon, they can grow into malignant lesions (Hofstad, B., Vatn, M. H., Andersen, S. N., Huitfeldt, H. S. et al., Growth of colorectal polyps: redetection and evaluation of unresected polyps for a period of three years, Gut 39(3): 449-456, 1996, incorporated herein by reference). In the case of flat lesions, in which no protruding polyps are present, the colonic mucosal surface is granular and demarcated into small areas called nonspecific grooves. Changes in the cellular pattern (pit pattern) of the colon lining may be the very earliest sign of adenomas or tumors (Muehldorfer, S. M., Muenzenmayer, C., Mayinger, B., Faller, G. et al., Optical tissue recognition based on color texture by magnifying endoscopy in patients with Barrett's esophagus, Gastrointestinal Endoscopy 57: AB179, 2003, incorporated herein by reference). Pit patterns can be used for qualitative detection of lesions by measuring these textural alterations of the colonic mucosal surface. Though the specificity of pit-pattern analysis is low, its relatively high sensitivity can highlight suspicious regions, permitting further examination by other sensors. Texture-based pit-pattern analysis can identify tissue types and disease severity.
Image-based polyp reconstruction can provide the 3 dimensional shape and size of a protruding polyp, using video from an endoscope and computer vision algorithms to synthesize multiple views that can be converted into 3 dimensional images. Optionally, several image processing algorithms and enhancement techniques can be used to improve image quality.
Various non-invasive scanning techniques have been proposed to avoid the need for a colonoscopy, but if these scanning techniques disclose the possible existence of a polyp or lesion, a colonoscopy must be performed later anyway. Further, the colonoscopist must locate the actual polyp or lesion (which was shown in the scan) in the patient's colon at a remote time after the scan, which can be very difficult. Also, scans do not provide information about the color or texture of the interior surface of the colon, which would provide diagnostic information about vessel structure and pit patterns, especially for flat lesions. It is highly desirable to avoid false positives for flat lesions because they do not project outward from the colon wall, so that they must be removed by cutting into the colon wall, thus incurring greater risks of bleeding, infection and other adverse side effects. Also, scans may not be able to differentiate between polyps or lesions and residual stool or other material in the colon.
Colonoscopies must be done efficiently in order to be economical, so the colonoscopist must rapidly scan the comparatively large area of the interior surface of the large intestine. Accordingly, there is a risk that a lesion or polyp may be overlooked.
If a lesion or polyp is found during a colonoscopy, it is unnecessary to relocate it in a later procedure because the endoscope is already at the lesion or polyp, and most endoscopes are equipped with a means by which to introduce cutting instruments. Thus, a polyp or lesion can be cut out during the same colonoscopy in which it was detected, either at the time it was first detected or at a later time during the same colonoscopy.
The skill of a colonoscopist in detecting and analyzing polyps and lesions depends on the individual colonoscopist's training and experience. Thus, to standardize detection and analysis in colonoscopies, it is desirable to provide independent expert analysis in real time during a colonoscopy to alert the colonoscopist to a potential polyp or lesion, or to confirm or question a diagnosis of any polyp or lesion that is found. It is also desirable to provide such expert knowledge in an inexpensive and readily available manner, without requiring the purchase of additional expensive hardware.
This invention is a process for providing computer aided diagnosis from video data of an organ during an examination with an endoscope, including analyzing the video data to discard poor quality image frames and provide satisfactory image frames from the video data; enhancing the image frames; and detecting and diagnosing any lesions in the image frames; wherein the analyzing, enhancing, and detecting and diagnosing steps are performed in real time during the examination. Optionally, the image frames can be used to reconstruct a 3 dimensional model of the organ.
As applied to colonoscopies, the invention is a process for providing computer aided diagnosis from video data of an endoscope during a colonoscopy of a large intestine, comprising: analyzing the video data to discard poor quality image frames and provide satisfactory image frames from the video data; enhancing the image frames; and detecting and diagnosing any polyps in the image frames; wherein the analyzing, enhancing, and detecting and diagnosing steps are all performed in real time during the colonoscopy by software operating on a computer that is operably connected to receive video data from the endoscope.
The analyzing step preferably comprises glint detection and elimination and focus analysis. The enhancing step preferably comprises contrast enhancement, super resolution and video stabilization. The detecting and diagnosing step preferably comprises color calibration, color analysis, texture analysis and feature detection. The texture analysis can include analyzing blood vessels and pit patterns. Optionally, the process also comprises: reconstructing the large intestine in three dimensions from the image frames to form a reconstructed model and recovering 3 dimensional shape and size information of any polyps in said colon from the reconstructed model. The reconstructing step preferably comprises fisheye distortion correction, geometry calibration, image based modeling and three dimensional data stitching.
1. System Framework of Automatic Image Quality Assessment
The present invention is a complex multi-sensor, multi-data and multi-algorithm image processing system. The design provides a modular and open architecture built on phenomenology (feature) based processing. The feature set includes the same features used by colonoscopists to assess disease severity (polyp size, pit pattern, etc.). The image-based polyp reconstruction algorithm comprises several steps: distortion correction, image based modeling, 3D data stitching and reconstruction. The texture-based pit-pattern analysis employs morphological operators to extract the texture pattern, and then utilizes a statistical model and machine learning algorithms to classify the disease severity according to the color and texture information of pits. By analyzing the 3D polyp shape and pit pattern, the colonoscopist is provided with diagnostic information for macroscopic inspection. The open architecture also allows for seamless integration of additional features (Maroulis, D. E., Iakovidis, D. K., Karkanis, S. A., and Karras, D. A., CoLD: a versatile detection system for colorectal lesions in endoscopy video-frames, Comput. Methods Programs Biomed. 70(2): 151-166, 2003; Buchsbaum, P. E. and Morris, M. J., Method for making monolithic patterning dichroic filter detector arrays for spectroscopic imaging, Ocean Optics, Inc., U.S. Pat. No. 6,638,668; Barrie, J. D., Aitchison, K. A., Rossano, G. S., and Abraham, M. H., Patterning of multilayer dielectric optical coatings for multispectral CCDs, Thin Solid Films 270: 6-9, 1995, all of which are incorporated herein by reference) from other microscopic modalities (such as OCT: Optical Coherence Tomography, FTIR: Fourier Transform Infrared and Confocal Microscopy), to provide more accurate diagnostic information. Our system also allows visualization and virtual navigation for the colonoscopist; this is done by virtual reality techniques and a magnetic sensor, which provides absolute spatial location and orientation.
The system described in this invention starts from RGB (Red-Green-Blue color space) videos acquired from a digital endoscope. A series of algorithms is employed to perform the image preprocessing (Pascale, D., A Review of RGB Color Spaces, The BabelColor Company, Montreal (Quebec), Canada, 2003; Wolf, S., Color Correction Matrix for Digital Still and Video Imaging Systems, NTIA Technical Memorandum TM-04-406, 2003, all of which are incorporated herein by reference). This is done by two modules: the first is the video quality analysis module, which aims to discard poor quality image frames and delete them from the video; the second is the video quality enhancement module, which aims to improve the image quality and reduce image artifacts. The whole framework is shown in
2. Video Quality Analysis and Enhancement
The video quality analysis and enhancement comprises a glint removal algorithm, a blur detection algorithm, a contrast enhancement algorithm, a super-resolution reconstruction algorithm and a video stabilization algorithm. The framework is shown in
2.1 Glint Removal Algorithm
We incorporate the same glint removal algorithm that we designed for cervical cancer CAD (Lange, H., Automatic glare removal in reflectance imagery of the uterine cervix, SPIE Medical Imaging 2005, SPIE Proc. 5747, 2005, incorporated herein by reference). The method extracts a glint feature signal from the RGB image that provides a good glint-to-background ratio, finds the glint regions in the image, and then eliminates the glint regions by restoring the estimated image features for those regions. We have chosen the G (Green) image component as the glint feature signal, because it provides a high glint-to-background ratio and simplicity of calculation. Glint regions are detected either as saturated regions or as small, high-contrast bright regions. Saturated regions are detected using an adaptive thresholding method. Small, high-contrast bright regions are detected using morphological top hat filters with different sizes and thresholds. The full extent of the glint regions is approximated using a morphologically constrained watershed segmentation algorithm plus a constant dilation. The image features (R,G,B) are first interpolated from the surrounding regions based on Laplace's equation. Then the intensity image feature is restored by adding to the interpolated region intensity function a scaled intensity function that is based on the error function between the interpolated region intensity and the raw intensity data from the region, and a signal based on the detected binary glint region. The glint detection and elimination algorithm consists of three consecutive processing steps: (1) glint feature extraction, (2) glint region detection, and (3) glint region elimination and image feature reconstruction.
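As an illustration of the detection and restoration steps above, the following minimal Python/NumPy sketch (not from the patent; the fixed saturation threshold and mean fill are assumptions standing in for the adaptive thresholding and Laplace-based reconstruction described above) finds saturated glint pixels in the green channel and fills them from the surrounding tissue.

```python
import numpy as np

def detect_glint_regions(green, sat_frac=0.95):
    """Detect candidate glint pixels in the green channel.

    Saturated pixels are found with a threshold relative to the channel
    maximum; this is a simple stand-in for the adaptive thresholding
    step (the exact adaptation rule is not given in the text).
    """
    thresh = sat_frac * green.max()
    return green >= thresh

def fill_glint_regions(green, mask):
    """Replace glint pixels with the mean of the non-glint pixels.

    The patent restores regions by Laplace interpolation from the
    surrounding tissue; a mean fill is used here as a minimal
    placeholder for that reconstruction step.
    """
    filled = green.astype(float).copy()
    filled[mask] = green[~mask].mean()
    return filled

# Synthetic 8-bit green channel with one saturated glint spot.
img = np.full((8, 8), 100, dtype=np.uint8)
img[3:5, 3:5] = 255
mask = detect_glint_regions(img)
restored = fill_glint_regions(img, mask)
```

The top-hat filtering and watershed refinement steps are omitted; in practice they would tighten the mask around each glint before the fill.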
2.2 Blur Detection Algorithm
The blur detection algorithm utilizes a normalized image power spectrum method (Gu, J. and Li, W., Automatic Image Quality Assessment for Cervical Imagery, SPIE Medical Imaging 2006; SPIE Proc. 6146, 2006, incorporated herein by reference), which can be described by the following steps:
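Although the individual steps of the cited method are not reproduced here, the underlying idea can be sketched as follows. This hypothetical Python fragment scores blur as the fraction of spectral power outside a low-frequency disc; the cutoff radius and the simple power ratio are assumptions, not the exact normalization of the cited work.

```python
import numpy as np

def blur_score(img, radius_frac=0.25):
    """Fraction of spectral power outside a low-frequency disc.

    Sharp images retain more high-frequency energy, so a lower score
    suggests blur. The cutoff radius is an assumed parameter.
    """
    f = np.fft.fftshift(np.fft.fft2(img.astype(float)))
    power = np.abs(f) ** 2
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    r = np.hypot(yy - h / 2, xx - w / 2)
    low = r <= radius_frac * min(h, w)
    return power[~low].sum() / power.sum()

# A random "sharp" frame and a smoothed copy of it.
rng = np.random.default_rng(0)
sharp = rng.random((64, 64))
blurred = (sharp + np.roll(sharp, 1, 0) + np.roll(sharp, -1, 0)
           + np.roll(sharp, 1, 1) + np.roll(sharp, -1, 1)) / 5
```

A frame whose score falls below a calibrated threshold would be discarded by the video quality analysis module.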
2.3 Contrast Enhancement
The method preferably used for contrast enhancement is adaptive histogram equalization, which enhances the contrast of images by transforming the values in the intensity image (Zuiderveld, K., Contrast limited adaptive histogram equalization, in Graphics Gems IV, ed. Heckbert, P., Academic Press, 1994, incorporated herein by reference). Unlike global histogram equalization, adaptive histogram equalization operates on small data regions (windows) rather than the entire image. Each window's contrast is enhanced so that the histogram of the output region approximately matches a specified histogram. The neighboring windows are then combined using bilinear interpolation in order to eliminate artificially induced boundaries. The contrast, especially in homogeneous areas, can be limited in order to avoid amplifying any noise present in the image. The results of contrast enhancement can be viewed in
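A minimal sketch of windowed histogram equalization is given below (assumed Python/NumPy code, not from the patent). It equalizes each window independently; a full implementation such as CLAHE would additionally clip each histogram and blend neighboring windows by bilinear interpolation, as described above.

```python
import numpy as np

def equalize(tile, n_bins=256):
    """Histogram-equalize one window via its cumulative distribution."""
    hist, edges = np.histogram(tile, bins=n_bins, range=(0, 255))
    cdf = hist.cumsum().astype(float)
    cdf = (cdf - cdf.min()) / max(cdf.max() - cdf.min(), 1) * 255
    return np.interp(tile.ravel(), edges[:-1], cdf).reshape(tile.shape)

def adaptive_equalize(img, tile=32):
    """Equalize each tile independently.

    Histogram clipping and bilinear blending of neighboring tiles
    (which hide the tile seams) are omitted in this sketch.
    """
    out = np.empty(img.shape, dtype=float)
    for y in range(0, img.shape[0], tile):
        for x in range(0, img.shape[1], tile):
            out[y:y + tile, x:x + tile] = equalize(img[y:y + tile, x:x + tile])
    return out

# Low-contrast gradient image: values span only 100..131.
img = np.tile(np.linspace(100, 131, 64), (64, 1)).astype(np.uint8)
out = adaptive_equalize(img)
```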
2.4 Super-resolution Reconstruction
Super resolution is a technique that uses multiple frames of the same object to achieve a higher resolution image (Kim, S. P., Bose, N. K., and Valenzuela, H. M., Recursive reconstruction of high resolution image from noisy undersampled multiframes, IEEE Transactions on Acoustics, Speech, and Signal Processing 38(6): 1013-1027, 1990; Irani, M. and Peleg, S., Improving resolution by image registration, CVGIP: Graphical Models and Image Processing 53: 231-239, 1991, all of which are incorporated herein by reference). Super resolution works when the frames are shifted by fractions of a pixel from each other. The super-resolution algorithm produces a larger image that contains the information in the smaller original frames: first, sub-pixel image registration is employed to establish the correspondence between several low resolution images, and then a sub-pixel interpolation algorithm is used to reconstruct the higher resolution image.
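The two-stage idea (sub-pixel registration, then fusion onto a finer grid) can be sketched in one dimension as follows. This is assumed illustrative Python, not the patent's algorithm: the sub-pixel shifts are taken as known, whereas a real implementation must estimate them and typically also deconvolves the sensor blur.

```python
import numpy as np

def fuse_frames(frames, shifts, scale=2):
    """Place each low-resolution frame onto a high-resolution grid at
    its known sub-pixel offset, averaging where samples coincide."""
    n = len(frames[0])
    hi = np.zeros(n * scale)
    count = np.zeros(n * scale)
    for frame, s in zip(frames, shifts):
        # Map low-res sample positions to high-res grid indices.
        idx = (np.arange(n) * scale + round(s * scale)) % (n * scale)
        hi[idx] += frame
        count[idx] += 1
    count[count == 0] = 1  # leave never-sampled positions at zero
    return hi / count

# Two frames sampling a ramp at integer and half-integer positions.
truth = np.arange(16, dtype=float)   # underlying high-res signal
f0 = truth[0::2]                     # samples at 0, 2, 4, ...
f1 = truth[1::2]                     # samples shifted by half a pixel
hi = fuse_frames([f0, f1], shifts=[0.0, 0.5])
```

With the half-pixel shift known exactly, the two frames interleave perfectly and the high-resolution signal is recovered; with estimated shifts and noise, interpolation and regularization would be needed.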
2.5 Video Stabilization
Video stabilization is the process of generating a compensated video sequence by removing the image motion caused by undesirable camera shake or jiggle. The preferred video stabilization algorithm consists of a motion estimation (ME) block, a motion smoothing (MS) block, and a motion correction (MC) block, as shown in
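The ME/MS/MC decomposition can be sketched in a one-dimensional form as follows (hypothetical Python; the real algorithm operates on 2D frames and may use block matching or feature tracking for ME). Translation is estimated by circular cross-correlation, the cumulative motion path is smoothed by a moving average, and each frame is shifted to remove only the high-frequency jitter.

```python
import numpy as np

def estimate_shift(ref, frame):
    """ME: integer translation via circular cross-correlation."""
    corr = np.fft.ifft(np.fft.fft(ref).conj() * np.fft.fft(frame)).real
    k = int(np.argmax(corr))
    return k if k <= len(ref) // 2 else k - len(ref)

def smooth(path, win=3):
    """MS: moving-average the cumulative motion path."""
    pad = np.pad(path, (win // 2, win // 2), mode='edge')
    return np.convolve(pad, np.ones(win) / win, mode='valid')

def stabilize(frames):
    """MC: shift each frame so only the smoothed motion remains."""
    shifts = [0.0]
    for a, b in zip(frames, frames[1:]):
        shifts.append(shifts[-1] + estimate_shift(a, b))
    path = np.array(shifts)
    jitter = path - smooth(path)          # high-frequency component
    return [np.roll(f, -int(round(j))) for f, j in zip(frames, jitter)]

# Jittery sequence: random 1D "frames" shifted by varying amounts.
rng = np.random.default_rng(1)
base = rng.random(64)
jitters = [0, 3, -2, 1, 0]
frames = [np.roll(base, j) for j in jitters]
stab = stabilize(frames)
```

Smoothing (rather than zeroing) the motion path preserves intentional camera movement, such as the scope advancing through the colon, while suppressing shake.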
3. Three Dimensional Colon Modeling and Reconstruction
A 3D colon model is a preferred component of a computer-aided diagnosis (CAD) system in colonoscopy, to assist surgeons in visualization, surgical planning and training. The ability to construct a 3D colon model from endoscopic videos (or images) is thus preferred in a CAD system for colonoscopy. Mathematical formulations and algorithms have been developed for modeling static, localized 3D anatomic structures within a colon that can be rendered from multiple novel viewpoints for close scrutiny and precise dimensioning (Mori, K., Deguchi, D., Sugiyama, J., Suenaga, Y. et al., Tracking of a bronchoscope using epipolar geometry analysis and intensity-based image registration of real and virtual endoscopic images, Med. Image Anal. 6(3): 321-336, 2002; Lyon, R. and Hubel, P., Eyeing the camera: into the next century, 349-355, IS&T/TSID 10th Color Imaging Conference, Scottsdale, Ariz., 2002; Zhang, X. and Payandeh, S., Toward Application of Image Tracking in Laparoscopic Surgery, Proc. of International Conference on Pattern Recognition 364-367, ICPR 2000, 2000, all of which are incorporated herein by reference). This ability is useful when a surgeon notices some abnormal tissue growth and wants a closer inspection and precise dimensioning.
The modeling system of the present invention uses only video images and follows a well-established computer-vision paradigm for image-based modeling. Prominent features are extracted from images and their correspondences are established across multiple images by continuous tracking and discrete matching. These feature correspondences are used to infer the camera's movement. The camera motion parameters allow the images to be rectified into a standard stereo configuration and the pixel movements (disparity) in these images to be inferred. The inferred disparity is then used to recover 3D surface depth. The inferred 3D depth, together with the texture information recorded in the images, allows construction of a 3D model with both structure and appearance information that can be rendered from multiple novel viewpoints. More precisely, the modeling system comprises the following components:
Creating a three-dimensional polyp model from endoscopic videos accomplishes three goals. First, the model allows the clinician to mark areas (on the model) during the entry phase of the colonoscopic exam and to treat these areas during withdrawal. Second, for high-risk patients who require surveillance, it provides a framework for registering the patient's clinical state across exams, thereby enabling change detection. Third, after the 3D reconstruction, the system of this invention can quantitatively calculate the physical size of the polyp and provide the colonoscopist with a statistical criterion for making diagnostic decisions.
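The depth-recovery step described above (disparity to 3D depth, which also underlies quantitative polyp sizing) can be sketched as follows for a rectified stereo pair. The focal length and baseline would come from the camera-motion estimation; the values below are illustrative assumptions, not parameters from the patent.

```python
import numpy as np

def depth_from_disparity(disparity, focal_px, baseline):
    """Recover depth for a rectified stereo pair: Z = f * B / d.

    focal_px is the focal length in pixels and baseline is the camera
    travel between the two views. Zero-disparity pixels correspond to
    points at infinity (or matching failures) and are marked inf.
    """
    d = np.asarray(disparity, dtype=float)
    z = np.full_like(d, np.inf)
    valid = d > 0
    z[valid] = focal_px * baseline / d[valid]
    return z

# Illustrative values: f = 800 px, 5 mm of camera travel between views.
z = depth_from_disparity([[4.0, 2.0], [0.0, 8.0]], focal_px=800, baseline=5.0)
```

Once per-pixel depth is known, the physical extent of a polyp can be measured by back-projecting its image contour to 3D at the recovered depths.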
4. Pit Pattern Analysis
Treatment decisions for flat and depressed lesions are based on a detailed examination of the macroscopic morphological appearance, including the luminal surface structure of the crypts of Lieberkühn, otherwise known as “pit patterns”. Pit pattern analysis can offer a degree of positive predictive value for both the underlying histology and the depth of vertical submucosal invasion. The preferred system utilizes the Kudo tissue classification method, which describes seven types of pit patterns (Kudo, S., Hirota, S., Nakajima, T., Hosobe, S., Kusaka, H., Kobayashi, T. et al., Colorectal tumours and pit pattern, Journal of Clinical Pathology 47(10): 880-885, 1994; Kudo, S., Rubio, C. A., Teixeira, C. R., Kashida, H., and Kogure, E., Pit pattern in colorectal neoplasia: endoscopic magnifying view, Endoscopy 33(4): 367-373, 2001; Kudo, S., Tamura, S., Nakajima, T., Yamano, H., Kusaka, H., and Watanabe, H., Diagnosis of colorectal tumorous lesions by magnifying endoscopy, Gastrointestinal Endoscopy 44(1): 8-14, 1996, all of which are incorporated herein by reference), according to histological and macroscopic morphology and size. These pit patterns have been correlated with histopathology to relate surface patterns to the underlying tissue structure. Lesions can be categorized into basic clinical groups: Kudo crypt group I/II constitutes non-neoplastic, non-invasive patterns; group IIIL/IIIS/IV/VI represents neoplastic but non-invasive lesions; and group VN represents neoplasia with accompanying invasive characteristics. Detailed characteristics are stated as follows, and appearances and photographic examples can be seen in
The pit pattern analysis module starts from high magnification endoscopic images, and morphological operators are first performed to extract the texture pattern (Chen, C. H., Pau, L. F., and Wang, P. S. P., Segmentation Tools in Mathematical Morphology, in Handbook of Pattern Recognition and Computer Vision, pp. 443-456, World Scientific Publishing Co., 1989, incorporated herein by reference). A statistical model and machine learning algorithms are then utilized (Sonka, M., Image Processing, Analysis and Machine Vision, 1998, incorporated herein by reference) to classify the disease severity according to the color and texture information of pits (Lin, H.-C., Wang, L.-L., and Yang, S.-N., Extracting periodicity of a regular texture based on autocorrelation functions, Pattern Recognition Letters 18(5): 433-443, 1997; Argenti, F., Alparone, L., and Benelli, G., Fast algorithms for texture analysis using co-occurrence matrices, Radar and Signal Processing, IEE Proceedings F, 137: 443-448, 1990, all of which are incorporated herein by reference). A large number of reference images with labeled annotations are used for training purposes, and thus the test images can be classified against these reference images based on their likelihood under the different classes (Magoulas, G. D., Plagianakos, V. P., and Vrahatis, M. N., Neural network-based colonoscopic diagnosis using on-line learning and differential evolution, Applied Soft Computing 4(4): 369-379, 2004; Liu, Y., Collins, R. T., and Tsin, Y., A computational model for periodic pattern perception based on frieze and wallpaper groups, IEEE Transactions on Pattern Analysis and Machine Intelligence 26(3): 354-371, 2004; Liu, Y., Lazar, N., Rothfus, W. E., Dellaert, F., Moore, S., Scheider, J., and Kanade, T., Semantic Based Biomedical Image Indexing and Retrieval, in Trends and Advances in Content-Based Image and Video Retrieval, eds. Shapiro, D., Kriegel, and Veltkamp, 2004; Zhang, J., Collins, R., and Liu, Y., Representation and matching of articulated shapes, in Computer Vision and Pattern Recognition, 2: II-342-II-349, IEEE Conference on Computer Vision and Pattern Recognition CVPR 2004, 2004, all of which are incorporated herein by reference). The training features include pit size, pit shape and pit density.
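The classification step can be sketched as a nearest-centroid rule over the (pit size, pit shape, pit density) features named above. The feature values, class labels, and the nearest-centroid rule itself are hypothetical stand-ins; the cited work uses richer statistical models and machine learning.

```python
import numpy as np

def train_centroids(features, labels):
    """Mean feature vector per class from labeled reference images."""
    classes = sorted(set(labels))
    return {c: np.mean([f for f, l in zip(features, labels) if l == c], axis=0)
            for c in classes}

def classify(feature, centroids):
    """Assign the class whose centroid is nearest in feature space."""
    return min(centroids, key=lambda c: np.linalg.norm(feature - centroids[c]))

# Hypothetical (pit_size, pit_shape_elongation, pit_density) vectors
# for reference images from two Kudo groups.
refs = [np.array(v) for v in [(1.0, 1.0, 0.6), (1.1, 1.0, 0.5),
                              (3.0, 2.5, 0.2), (2.8, 2.4, 0.25)]]
labels = ['I/II', 'I/II', 'IIIL', 'IIIL']
cents = train_centroids(refs, labels)
pred = classify(np.array([2.9, 2.3, 0.22]), cents)
```

A production classifier would model class likelihoods rather than distances, allowing the likelihood-based clustering against reference images that the text describes.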
While the present invention has been disclosed in connection with the best modes described herein, it should be understood that there may be other embodiments which fall within the spirit and scope of the invention, as defined by the claims. Accordingly, no limitations are to be implied or inferred in this invention except as specifically and explicitly set forth in the claims.
This invention can be used whenever video data from an endoscope is available during an examination of an organ, and especially when it is necessary to diagnose potentially cancerous regions, and motion pictures of the regions are available.
This application claims priority to US provisional patent application 60/839275 filed Aug. 21, 2006, incorporated herein by reference.