The invention relates generally to colonoscopy procedures and apparatus. In particular, the invention is a method and apparatus for tracking and evaluating a colonoscopy procedure and for providing a display representative of the visualization and evaluation in real time during the procedure.
Colonoscopy is the most prevalent screening tool for colorectal cancer. Its effectiveness, however, is subject to the degree to which the entire colon is visualized during an exam. There are several factors that may contribute to incomplete viewing of the entire colonic wall. These include particulate matter in the colon, subject discomfort/motion, physician attention, the speed at which the endoscope is withdrawn, and complex colonic morphology. There is, therefore, a continuing need for methods and apparatus for enhancing the visualization of the colon during colonoscopy.
The invention is a system for evaluating a colonoscopy procedure performed using an endoscope. One embodiment of the invention includes a tracking input, a video input, a processor and a display output. The tracking input receives position data representative of the location and/or orientation of the endoscope within the patient's colon during the procedure. The video input receives video data from the endoscope during the procedure. The processor is coupled to the tracking input and video input, and generates visualization metrics as a function of the video data and evaluation display information representative of the visualization metrics at associated colon locations as a function of the visualization metrics and the position data. The display output is coupled to the processor to output the evaluation display information.
Enhanced colonoscopy in accordance with one embodiment of the invention includes the combination of magnetic or other tracking technology, video data from the colonoscope, and signal processing software. The use of enhanced colonoscopy identifies regions of the colon that may have been missed or inadequately viewed during an exam. Other embodiments incorporate data from a preceding CT colonography scan (if one was performed), which provides additional benefit when available. Any pre-acquired data can be used for this purpose, including CT, MR or nuclear medicine scans, to provide structural information (e.g., the shape of the colon) or functional information (e.g., potential lesions). The software would use the CT colonography data to inform the colonoscopist when the endoscope is approaching a lesion identified on CT colonography. However, since CT colonography increases costs and limits this enhancement procedure to fewer clinical sites and cases, the system will guide the endoscopist to achieve nearly 100% viewing of the colon without requiring a CT scan prior to the procedure. The invention can be integrated into existing colonoscopy systems from multiple manufacturers or implemented as a stand-alone system.
During the procedure, a tracked scope is connected to the colonoscope computer as well as to an external computer system which collects the tracking and video data.
The endoscopist conducts the colonoscopy in a routine manner using the standard LCD TV 36. The guidance system 20 can record and process both the scope position and video data and generate a visualization which will approximately represent the colon in 3D and provide feedback about regions of the colon which have been missed or poorly viewed. The display can be generated in real time or otherwise sufficiently fast to enable the endoscopist to utilize the information from the display without disturbing normal examination routine. Other display approaches that provide the visualization information described herein can be used in other embodiments of the invention.
There are several technical components in this approach which can coordinate the tracker data and video data. These include (1) reconstructing the colon centerline and endoluminal surface, (2) mapping video data properties to the reconstructed colon, (3) evaluating the quality of the video data stream, and (4) presenting the data in a manner which can guide the endoscopist to examine missing or poorly viewed regions of the colon.
Each processing component of the described embodiment uses a common notation described below:
Ft video frame acquired at time t
IMt a vector of image metrics (1, 2, . . . , N) for frame Ft
reft a sampled 3D position (x, y, z) from the ref. patch at time t
scopet a sampled 3D position (x, y, z) from scope at time t
Pt a patient-corrected position of the scope computed from scopet and reft
{P} the ordered point collection following filtering
{
{C} the ordered set of all points in the centerline
M the ordered set of vertices and corresponding edges in the 3D colon model
{
During acquisition, three coordinated signals are acquired—the video frame (Ft), the position of the scope tip (scopet), and the position of the reference patch (reft) located on the patient's back. The patient tracker position is subtracted from the endoscope tracker position to yield a patient-corrected position of the scope, Pt. This ensures that any gross patient motion is not characterized as endoscope motion. Since the magnetic reference is attached to the table, table motion is not a problem because its position relative to the magnetic reference is fixed. Processing begins when there are a predetermined number of points collected in the set ({P}) which can range from a small number of points to the entire path traversed by the scope. Other embodiments (not shown) making use of multiple tracker points acquired at a single time point (e.g., from multiple sensors or an imaging method such as fluoroscopy) can use a similar methodology. In embodiments such as these the subscript “t” can be replaced by the subscript “n” referring to an ordered sample of points collected at one time rather than across time.
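The patient correction described above can be sketched in a few lines of Python. This is only an illustration: the function names `patient_corrected` and `corrected_path` are hypothetical, and the sketch assumes each sample is an (x, y, z) tuple with scope and reference samples already paired by acquisition time.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]

def patient_corrected(scope: Point, ref: Point) -> Point:
    """Return Pt = scopet - reft, computed component by component."""
    return (scope[0] - ref[0], scope[1] - ref[1], scope[2] - ref[2])

def corrected_path(scope_samples: List[Point], ref_samples: List[Point]) -> List[Point]:
    """Build the ordered collection of patient-corrected points from paired samples."""
    return [patient_corrected(s, r) for s, r in zip(scope_samples, ref_samples)]

# If the patient shifts by the same amount as the scope, the corrected
# position does not change, so gross patient motion is not read as scope motion.
before = patient_corrected((5.0, 2.0, 1.0), (1.0, 1.0, 1.0))
after = patient_corrected((6.0, 2.0, 1.0), (2.0, 1.0, 1.0))
```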
The set of patient-corrected scope position points may require filtering to reduce noise depending on the quality of the tracked data. Both linear and non-linear filtering methods can be used alone or in combination depending on the type of noise present.
Linear filtering can be used to uniformly remove high frequency noise (such as system noise from the tracker). A moving average filter of size N may be implemented as:
Non-linear filtering can be used to remove spurious noise from the data in which single samples are well-outside of specification. For example,
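As a hedged illustration of the two filter families named above, the following Python sketch applies a moving average and a median filter to a list of patient-corrected points. The window alignment (trailing for the average, centered for the median), the shortened windows at the ends of the data, and the function names are assumptions made for this sketch rather than details taken from the described embodiment.

```python
from statistics import median
from typing import List, Tuple

Point = Tuple[float, float, float]

def moving_average(points: List[Point], n: int) -> List[Point]:
    """Linear filter: average each point with up to n-1 preceding points."""
    out = []
    for i in range(len(points)):
        window = points[max(0, i - n + 1): i + 1]  # shorter at the start
        out.append(tuple(sum(p[k] for p in window) / len(window) for k in range(3)))
    return out

def median_filter(points: List[Point], n: int) -> List[Point]:
    """Non-linear filter: per-coordinate median over a centered window of odd size n."""
    half = n // 2
    out = []
    for i in range(len(points)):
        window = points[max(0, i - half): i + half + 1]  # shorter at both ends
        out.append(tuple(median(p[k] for p in window) for k in range(3)))
    return out

# A single spurious sample (x = 100) is rejected by the median filter.
noisy = [(0, 0, 0), (1, 0, 0), (100, 0, 0), (3, 0, 0), (4, 0, 0)]
despiked = median_filter(noisy, 3)
smoothed = moving_average([(0, 0, 0), (2, 0, 0), (4, 0, 0)], 2)
```

The two filters can be chained, for example by applying the median filter first to remove outliers and then the moving average to suppress the remaining high frequency noise.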
The purpose of reconstruction is to use the collected points to generate an approximate model of the colon based on the position of the scope during an exam. This is illustrated in
When using a pre-defined centerline, the centerline, {C}, can be approximated from the sampled scope positional data. There are several approaches for generating a centerline including:
One-to-One Mapping of {
Spline-fitting: Splines may be used to reduce the number of points in {
Statistical centerline calculation: In this approach, the centerline is calculated from a statistical volume created from {
The resulting volume provides a likelihood map of the location of the interior of the colon. The map can be thresholded to generate a mask of where the scope has traveled, defining the interior of the colon. A shortest path method can be used to generate the centerline from the mask.
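One possible reading of the statistical centerline steps is sketched below in Python. It is an assumption-laden illustration, not the described implementation: the statistical volume is taken to be a sparse voxel histogram of scope positions, the threshold is a minimum visit count, and the shortest path is found by breadth-first search over 26-connected voxels. All function names and parameters are hypothetical.

```python
from collections import Counter, deque

def voxel_of(p, size):
    """Index of the cubic voxel (edge length `size`) containing point p."""
    return tuple(int(c // size) for c in p)

def likelihood_volume(points, size):
    """Sparse volume: number of scope samples that fell in each voxel."""
    return Counter(voxel_of(p, size) for p in points)

def threshold_mask(volume, min_count):
    """Voxels visited at least min_count times, treated as colon interior."""
    return {v for v, n in volume.items() if n >= min_count}

def shortest_path(mask, start, goal):
    """Breadth-first search through mask voxels; returns a voxel list or None."""
    if start not in mask or goal not in mask:
        return None
    offsets = [(dx, dy, dz) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
               for dz in (-1, 0, 1) if (dx, dy, dz) != (0, 0, 0)]
    parent = {start: None}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if v == goal:
            path = []
            while v is not None:  # walk back to the start voxel
                path.append(v)
                v = parent[v]
            return path[::-1]
        for d in offsets:
            w = (v[0] + d[0], v[1] + d[1], v[2] + d[2])
            if w in mask and w not in parent:
                parent[w] = v
                queue.append(w)
    return None

# Three well-sampled voxels along x plus one stray sample that is thresholded out.
samples = [(0.5, 0.5, 0.5)] * 2 + [(1.5, 0.5, 0.5)] * 2 + [(2.5, 0.5, 0.5)] * 2 + [(1.5, 1.5, 0.5)]
mask = threshold_mask(likelihood_volume(samples, 1.0), 2)
centerline_voxels = shortest_path(mask, (0, 0, 0), (2, 0, 0))
```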
Once the centerline is created, a model can be generated, for example, by extruding a primitive shape along the points in {C}. In one implementation of this model, the primitive is defined as a discrete set of ordered points at a fixed radius (r) which describe a circle
{circle}={(x,y):x=r·cos(0 . . . 2π),y=r·sin(0 . . . 2π)}
and the extruded model is
M={Ct:Ct=T·circle}
where T is the transformation matrix defined by (Ct−Ct-1).
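The extrusion can be illustrated with a short Python sketch. It assumes that the transformation T orients the circle perpendicular to the local direction (Ct−Ct-1) and translates it to Ct; the construction of the perpendicular frame, the number of circle samples, and all function names are choices made for this example only. Consecutive centerline points are assumed to be distinct.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def normalize(a):
    n = math.sqrt(sum(x * x for x in a))
    return tuple(x / n for x in a)

def circle(r, n):
    """n ordered 2D points on a circle of radius r."""
    return [(r * math.cos(2 * math.pi * k / n), r * math.sin(2 * math.pi * k / n)) for k in range(n)]

def frame(tangent):
    """Two unit vectors spanning the plane perpendicular to the tangent."""
    t = normalize(tangent)
    helper = (1.0, 0.0, 0.0) if abs(t[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = normalize(cross(t, helper))
    return u, cross(t, u)

def extrude(centerline, r, n=16):
    """One ring of n surface points per centerline point, starting at the second point."""
    rings = []
    for i in range(1, len(centerline)):
        c = centerline[i]
        u, v = frame(sub(c, centerline[i - 1]))
        rings.append([tuple(c[k] + x * u[k] + y * v[k] for k in range(3)) for x, y in circle(r, n)])
    return rings

rings = extrude([(0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, 2.0)], r=2.0, n=8)
```

Because each ring is oriented independently, neighboring rings may be rotated relative to one another about the centerline; a mesh built from them would need a consistent vertex ordering between rings.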
When using a pre-defined model of the colon, the model of the colon can be fit to the tracking data. The pre-defined model is deformed to fit the tracker data. To account for soft tissue deformation, the virtual model can be “pliable” in the virtual sense such that it can be stretched or twisted to fit the tracker data. Either a patient-specific virtual model or a generic anatomic virtual model can be used to register the tracker data. This fitting task would initialize the pre-determined model (and its corresponding centerline {C})—which can be derived from pre-existing generic data or the patient's image data—in the space of {
Using landmark fitting, anatomical landmarks (or specific regions of the colon) such as the appendiceal orifice and ileocecal valve in the cecum, the hepatic flexure, the triangular appearance of the transverse colon, the splenic flexure, and the anal verge at the lower border of the rectum can be used to align specific points (
Using surface fitting, the pre-determined model can be deformed (with or without constraints) such that it maximizes the number of
Following reconstruction, the model (M) and corresponding centerline ({C}) are used for mapping the original points {P} into the model.
Alternatively or in addition, the tracker data can be used to compute an approximation of the centerline of the colon. After the computed centerline is generated, a generic surface can be created with a circular cross section having a fixed radius. While these approaches may not specifically reconstruct the exact true geometry of the colon, the true surface geometry is not required for guiding the procedure in accordance with the invention.
Any of a number of image quality metrics (represented as vector IMt) can be determined from the video data. These include intensity, sharpness, color, texture, shape, reflections, graininess, speckle, etc. To realize real-time processing with the system, metrics can be approximated or sparsely sampled for computational efficiency. Intensity, for example, may serve as a useful metric of quality: darker regions indicate lower quality image data, whereas brighter regions indicate better image data. Regional sharpness, calculated as
can be used to determine the quality of the image data—higher sharpness is less blurry data. In
Analysis of regions of interest (ROIs) can be used to further refine the image classification of quality analysis. For example, each video image can also be partitioned into nine regions a-i as shown in
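A region-based metric computation of this kind might look like the following Python sketch. The sharpness measure used here (mean absolute difference between neighboring pixels) is a stand-in chosen for illustration and is not necessarily the measure used in the described embodiment; the row-major assignment of labels a-i, the grayscale list-of-rows image format, and the function names are likewise assumptions. The image is assumed to be at least 3 pixels in each dimension.

```python
def region_bounds(size, parts=3):
    """Split the range [0, size) into `parts` contiguous (start, stop) intervals."""
    return [(i * size // parts, (i + 1) * size // parts) for i in range(parts)]

def region_metrics(image):
    """Map region labels 'a'..'i' to (mean intensity, sharpness proxy)."""
    h, w = len(image), len(image[0])
    labels = iter("abcdefghi")  # assumed row-major order, top-left first
    metrics = {}
    for r0, r1 in region_bounds(h):
        for c0, c1 in region_bounds(w):
            pixels = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            intensity = sum(pixels) / len(pixels)
            # Absolute differences between horizontal and vertical neighbors.
            diffs = [abs(image[r][c + 1] - image[r][c])
                     for r in range(r0, r1) for c in range(c0, c1 - 1)]
            diffs += [abs(image[r + 1][c] - image[r][c])
                      for r in range(r0, r1 - 1) for c in range(c0, c1)]
            sharpness = sum(diffs) / len(diffs) if diffs else 0.0
            metrics[next(labels)] = (intensity, sharpness)
    return metrics

# 6x6 test image: uniform gray except a checkerboard in the top-left region.
image = [[10] * 6 for _ in range(6)]
image[0][0], image[0][1], image[1][0], image[1][1] = 0, 20, 20, 0
metrics = region_metrics(image)
```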
The fusion of the model, original data, and results of the video data constitutes the parametric mapping component. In preparation for mapping the video data onto the virtual model, the tracker data is normalized to the centerline of the colon to generate “standard views” from the scope. The benefit is that if the same section is viewed multiple times from different angles, the corresponding “standard view” will be the same.
The patient tracker position can be subtracted from the endoscope tracker position to ensure that any gross patient motion is not characterized as endoscope motion. Since the magnetic reference is attached to the table, table motion is effectively eliminated because the table position relative to the magnetic reference will not change. Each endoscope tracker point can be mapped to the pre-defined centerline by determining the closest centerline point to the vector defined by the tracker data. Accordingly, if the endoscope doesn't move, but looks to sides such as left or right, then all the acquired video frames will be associated with the same centerline point, but at different viewing angles.
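The association of tracker points with centerline points can be sketched as a nearest-neighbor lookup. The Python example below is a simplified illustration that uses only the tracked position and ignores orientation; the function names are hypothetical.

```python
def closest_centerline_index(p, centerline):
    """Index of the centerline point with the smallest squared distance to p."""
    return min(range(len(centerline)),
               key=lambda i: sum((p[k] - centerline[i][k]) ** 2 for k in range(3)))

def map_frames(points, centerline):
    """Group frame indices t by the centerline point their tracker sample maps to."""
    mapping = {}
    for t, p in enumerate(points):
        mapping.setdefault(closest_centerline_index(p, centerline), []).append(t)
    return mapping

centerline = [(0, 0, 0), (0, 0, 10), (0, 0, 20)]
tracked = [(1, 0, 1), (0, 1, 9), (0, 0, 11), (2, 2, 19)]
frames_by_point = map_frames(tracked, centerline)
```

In this example the second and third frames map to the same centerline point, which corresponds to the case in which the scope stays in place while the view changes.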
The mapping is as follows in one embodiment of the invention, although other approaches can be used. Each point of the originally sampled points (
IM′t=aggregate(IMt at qt)
where the aggregate function may be an average, max, min, median, or other function. Using a pre-defined color scale, the {IM″t} set is then used to color the surface of M at each vertex.
Presentation of the processed signal and image data is primarily driven by the virtual model of the colon. The model provides an approximate, patient-specific, representation of the colon. On the surface of the colon, color patches are displayed to identify regions of high and low quality data. The patch color can vary according to a pre-defined color scale. White might be used in regions of the colon that have not been viewed at all. Red regions might suggest that only low quality images have been collected whereas green patches may show regions of high quality images (free of stool and foam, sharp images with adequate lighting and color).
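The aggregation and color coding steps can be illustrated as follows. This Python sketch assumes, for simplicity, that the image metrics for each frame have already been reduced to a single quality score between 0 and 1 and that the color scale is a linear red-to-green ramp; neither assumption comes from the described embodiment, and the names used are hypothetical.

```python
from statistics import mean, median

AGGREGATES = {"mean": mean, "max": max, "min": min, "median": median}

def aggregate_quality(samples_by_vertex, how="mean"):
    """Combine all quality scores mapped to each vertex; unviewed vertices are omitted."""
    fn = AGGREGATES[how]
    return {v: fn(s) for v, s in samples_by_vertex.items() if s}

def patch_color(quality, low=0.0, high=1.0):
    """RGB color for a vertex: white if never viewed, red (low) through green (high)."""
    if quality is None:
        return (255, 255, 255)
    x = min(1.0, max(0.0, (quality - low) / (high - low)))  # clamp to [0, 1]
    return (round(255 * (1 - x)), round(255 * x), 0)

scores = aggregate_quality({0: [0.2, 0.4], 1: []}, how="max")
colors = {v: patch_color(scores.get(v)) for v in (0, 1)}
```

Choosing `max` as the aggregate credits a vertex with its best view, whereas `min` or `median` would penalize regions that were seen well only briefly.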
In one embodiment, the system is implemented on a mobile cart which can be brought into a procedure room prior to the start of a colonoscopy. Other versions can be fully integrated into the procedure room.
In one embodiment, the software is a multi-threaded application which simultaneously acquires both the tracker data and video data in real-time. In addition to storing all of the data to disk, the data is processed in real-time and drawn to the screen. The same display is also sent to the LCD TV in the procedure room.
The invention can be performed using segmental analysis. In this embodiment, the colon will be divided into segments. These segments can include, but are not limited to, the cecum, proximal to mid ascending colon, mid ascending to hepatic flexure, hepatic flexure, proximal to mid transverse colon, mid transverse to splenic flexure, splenic flexure, proximal descending to mid descending, mid descending to proximal sigmoid, sigmoid, and rectum. Each segment can be visualized at least twice and the image data analyzed and compared to determine the degree of visualization. For example, a concordance between sweeps 1 and 2 of 100% can be interpreted to mean that 100% of the mucosa was visualized, while a lower level of concordance may indicate progressively lower visualization rates. These data sets will be computed in real time or near real time and the information provided by a variety of means, including visual and/or auditory cues, in order to inform the proceduralist of the results and aid in decision making regarding adequate visualization of the mucosa.
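The text does not define how concordance between sweeps is computed. As one hypothetical definition, the Python sketch below treats each sweep as the set of model vertices viewed within a segment and reports the overlap of the two sets relative to their union.

```python
def concordance(sweep1, sweep2):
    """Fraction of vertices seen in both sweeps among those seen in either (assumed definition)."""
    union = sweep1 | sweep2
    if not union:
        return 0.0  # nothing was viewed in either sweep
    return len(sweep1 & sweep2) / len(union)

full = concordance({1, 2, 3}, {1, 2, 3})
partial = concordance({1, 2}, {2, 3})
```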
Prior exam data can be incorporated into other embodiments of the invention. For example, prior examination data from two sources can be used. One source of prior data is pooled data from multiple endoscopists. This data could provide a statistical likelihood and 95% CI (confidence interval) that the mucosa in a given segment of the colon has been visualized with blur free images. Data used to provide this instrument could include examinations where mucosal surface visualized has been verified by more than one examiner, or by correlation with another technology such as CT colonography. Other relevant data that might modify the likelihood can include the speed of withdrawal, the specific anatomic segment (variable likelihood in different segments), the number of times the segment has been traversed, etc. The second source of prior data is examinations from the specific endoscopist. Endoscopist specific modifiers of the likelihood of complete mucosal visualization could include the speed of withdrawal, and perhaps even some seemingly unrelated factors like the specific endoscopist's overall polyp detection rate, etc. (i.e. some endoscopists might need more of an accuracy handicap than others).
Relevance feedback can also be incorporated into the invention. In embodiments including this feature, information provided by the computer system is tailored to be non-disruptive yet compelling in indicating the extent and quality of visualization within a temporal and/or spatial block. This is achieved through a relevance feedback framework wherein the system gauges the efficacy of its extent/quality cues as a function of the endoscopist's subsequent response and uses this information to iteratively improve subsequent cueing.
The system provides extent/quality cues to the recently visualized segment and objectively interprets the subsequent actions of the endoscopist as to whether, and to what degree, the cues are relevant or irrelevant to the exam. The system then learns to adapt its assumed notion of quality and/or coverage to that of the endoscopist. The feedback operates in both greedy and cooperative user modes. In the greedy mode, the system provides feedback for every recently visualized region. In the cooperative user mode, wherein a segment is repeatedly visualized in multiple sweeps, the feedback progressively learns, unlearns and relearns its judgment.
The computational strategy for achieving relevance feedback involves “active learning” or “selective sampling” of extent/quality-sensitive features in order to achieve the maximal information gain, or minimized entropy/uncertainty, in decision-making. Active learning provides accumulation, stratification and mapping of knowledge during examination from time to time, segment to segment, endoscopist to endoscopist and from patient to patient. The resultant mapping learned across the spectrum can potentially minimize intra-exam relevance feedback loops, which might translate into an optimal examination.
An accelerometer can also be incorporated into embodiments of the invention described herein. An accelerometer embedded at or near the tip of the colonoscope, for example, will provide feedback regarding the motion of the scope. In particular, the “forward” and “backward” motion of the scope provides useful information about the action of the endoscopist. “Forward” actions (in most but not all cases) are used during insertion to feed the scope through the colon; “backward” motion (in most cases but not all) is the removal of the scope and is often associated with viewing of the colon. For the purposes of computer assisted guidance, the path of the scope may be constructed during insertion only, whereas image analysis may occur during removal. Alternatively, multiple forward and back motions may indicate direct interrogation of folds or other motions which would confound the automated analysis; this could be determined from the accelerometer data. Additional accelerometers can be placed along the length of the scope. Using a flexible tube model, the combination of accelerometers can be used to infer some features of the shape of the scope. In particular, multiple adjacent sensors could be used to detect looping of the scope. Moreover, during insertion or pullback, the repeated capture of multiple accelerometers can be used to reconstruct the path of the entire scope. An inertial navigation system (INS), generally a 6 DOF (degree of freedom) measurement device containing accelerometers and gyroscopes, can also provide local motion estimates and be combined with other INS devices to infer features of the entire scope including the shape of the scope.
A stereoscopic view/laser range finder can be incorporated into the invention. Reconstruction of the local 3D geometry can be achieved through several different methods. A combination of stereo views and image processing (texture/feature alignment) can be used to reconstruct the 3D geometry from a scene. Stereo optics can, for example, be incorporated into the colonoscope. Alternatively, a specialty lens could be attached to the tip of a scope to achieve a stereoscopic view. This can be achieved through a lenticular lens or possibly multiple lenses which are interchangeably placed in front of the camera. A visible light filter can be swept across the scene to reconstruct the 3D surface (in a manner similar to laser surface scanners and/or laser range finders). A combination of multiple views from a tracked camera can also be used to reconstruct the interior surface of the colon. The reconstructed 3D surface can be used to detect disease such as polyps (based on curvature), evaluate normal, abnormal, and extent of folding of the colon wall, and precisely measure lesion size.
Insufflation can also be used in connection with the invention. Poor insufflation of the colon results in poor viewing of the colon wall (particularly behind folds). Automatically determining the sufficient insufflation is an important process to incorporate in the system. Using a 3D surface reconstruction system the uniformity of the colon wall can be used as a metric for proper insufflation. The extent of folds can also be estimated from the video data. Specifically, local image features such as the intensity gradient can be used to determine the shape and extent of folds within the field of view. Finding a large number of image gradients located in close proximity suggests a fold in the colon wall. Alternatively, by varying the insufflation pressure slightly, the changes in image features (such as gradients) can provide an estimate of fold locations and extent of folds.
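The gradient-density idea for locating folds can be sketched in Python as follows. The forward-difference gradient, the magnitude threshold, the fixed non-overlapping windows, and the minimum count are all illustrative assumptions, as are the function names; the image is again assumed to be a grayscale list of rows.

```python
def strong_gradient_mask(image, threshold):
    """True where the forward-difference gradient magnitude reaches the threshold."""
    h, w = len(image), len(image[0])
    mask = [[False] * w for _ in range(h)]
    for r in range(h - 1):          # last row and column have no forward neighbor
        for c in range(w - 1):
            gx = image[r][c + 1] - image[r][c]
            gy = image[r + 1][c] - image[r][c]
            mask[r][c] = (gx * gx + gy * gy) ** 0.5 >= threshold
    return mask

def fold_windows(image, threshold, window, min_count):
    """Top-left corners of non-overlapping windows holding at least min_count strong gradients."""
    mask = strong_gradient_mask(image, threshold)
    h, w = len(mask), len(mask[0])
    hits = []
    for r0 in range(0, h - window + 1, window):
        for c0 in range(0, w - window + 1, window):
            count = sum(mask[r][c] for r in range(r0, r0 + window)
                        for c in range(c0, c0 + window))
            if count >= min_count:
                hits.append((r0, c0))
    return hits

# 8x8 test image: strong alternating edges in the top-left quadrant only.
image = [[0, 100, 0, 100, 0, 0, 0, 0] if r < 4 else [0] * 8 for r in range(8)]
candidates = fold_windows(image, threshold=50, window=4, min_count=6)
```

Comparing the candidate windows obtained at two slightly different insufflation pressures would be one way to approximate the pressure-variation approach mentioned above.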
Although the present invention has been described with reference to preferred embodiments, those skilled in the art will recognize that changes can be made in form and detail without departing from the spirit and scope of the invention.
Filing Document | Filing Date | Country | Kind | 371c Date
---|---|---|---|---
PCT/US09/65536 | 11/23/2009 | WO | 00 | 6/28/2011

Number | Date | Country
---|---|---
61199948 | Nov 2008 | US