Different imaging modalities capture different kinds of information about the structures being imaged. For example, infrared (IR) images of the retina and other multilayer structures contain information not available from visible light images because IR signals penetrate the multiple layers of the retina better than visible light. This is illustrated in
However, because IR light can penetrate and reflect from multiple layers of a structure, IR images are hard to read with the human eye. IR images present two principal challenges. First, IR images are often of lower spatial resolution than visible light images and consequently are more muddied and less sharp. Second, the intensity at each pixel or pixel region consists of information from multiple layers of the structure (e.g., the retinal pigment epithelium and choroidal areas of the retina), reducing contrast for subretinal features. A clinician using existing imaging and analysis systems cannot distinguish the information arising from each different layer of the structure, and thus many of the advantages of multilayer imaging have not been realized.
The difficulty of extracting information from IR images has impeded the adoption of IR imaging, particularly in retinal fundus imaging. In current practice, a clinician interested in the surface and deeper tissue of a patient's eye will dilate the patient's eye prior to visible light imaging, and/or inject a contrast agent into the patient's bloodstream, in order to detect and locate retinal/choroidal structures. However, these are invasive and uncomfortable procedures for the patient. Existing IR imaging systems, such as scanning laser ophthalmoscopes, are expensive, fragile, difficult to transport, and require significant operator training.
Described herein are systems, methods and non-transitory computer readable media for multilayer imaging and retinal injury analysis. These methods are preferably implemented by a computer or other appropriately configured processing device.
In a first aspect, a computer receives a first image of an eye of a subject, the first image including at least one infrared image or near-infrared image. The computer interdependently smoothes and segments the first image. Segmenting the first image may include identifying edge details within the first image. The segmenting and smoothing may include determining an edge field strength at a plurality of locations in the image, and the computer may determine the attribute based on the edge field strength. In some such implementations, the edge field strength is based on a matrix edge field.
The computer determines a value of an attribute at a plurality of locations within the smoothed, segmented first image. The attribute is indicative of at least one retinal or subretinal feature in the first image. In some implementations, the computer identifies the at least one feature based at least in part on the first attribute image. The computer may identify a boundary of the at least one feature based at least in part on the first attribute image. The at least one feature may include a lesion, and the computer may provide quantitative information about the lesion. The feature may include a zone 3 injury, such as a choroidal rupture, a macular hole, or a retinal detachment. The feature may be indicative of a traumatic brain injury. The feature may be indicative of at least one of age-related macular degeneration, juvenile macular degeneration, retinal degeneration, retinal pigment epithelium degeneration, toxic maculopathy, glaucoma, a retinal pathology and a macular pathology.
The computer generates a first attribute image based at least in part on the determined values of the attribute, and provides the first attribute image. In some implementations, the computer provides the first attribute image to a display device, and may subsequently receive a triage category for the subject from a clinician. In some implementations, the computer provides a sparse representation of the first attribute image. To provide the sparse representation, the computer may perform a compressive sensing operation. The computer may also store a plurality of features on a storage device, each feature represented by a sparse representation, and may compare the identified at least one feature to the stored plurality of features.
In some implementations, the computer also receives a second image of the eye, the second image generated using a different imaging modality than used to generate the first image of the eye. The imaging modality used to generate the second image of the eye may be visible light imaging, for example. In such an implementation, the computer combines information from the first image of the eye with information from the second image of the eye. The computer may also display information from the first image of the eye with information from the second image of the eye. In some implementations, the computer combines information from the first image of the eye with information from a stored information source.
In some implementations, the computer determines a textural property of a portion of the first image of the eye based at least in part on the first attribute image, and also compares the first image of the eye to a second image of a second eye by comparing the determined textural property of the portion of the first image of the eye to a textural property of a corresponding portion of the second image of the second eye. The first image and the second image may represent a same eye, different eyes of a same subject, or eyes of different subjects. For example, the first image and the second image may represent a same eye at two different points in time. The computer may be included in a physiological monitoring or diagnostic system, such as a disease progression tracking system, a treatment efficacy evaluation system, or a blood diffusion tracking system. In some implementations, the textural properties of the respective portions of the first and second images of the eye are represented by coefficients of a wavelet decomposition, and the computer compares the first image of the eye to the second image of the eye by comparing the respective coefficients for a statistically significant difference. In some implementations, the textural properties of the respective portions of the first and second images are represented by respective first and second edge intensity distributions, and the computer compares the first image of the eye to the second image of the eye by comparing at least one statistic of the first and second edge intensity distributions.
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawings will be provided to the Office upon request and payment of the necessary fee. The above and other features of the present invention, its nature and various advantages will be more apparent upon consideration of the following detailed description, taken in conjunction with the accompanying drawings in which:
To provide an overall understanding of the invention, certain illustrative embodiments will now be described, including systems and methods for multilayer imaging and retinal injury analysis. However, it will be understood by one of ordinary skill in the art that the systems and methods described herein may be adapted and modified as is appropriate for the application being addressed and that the systems and methods described herein may be employed in other suitable applications, and that such other additions and modifications will not depart from the scope thereof.
The system 300 includes at least one image capture device 310 for capturing images of a scene. Exemplary image capture devices 310 include visible light cameras and video recorders; scanning laser ophthalmoscopes operating at any wavelength, with or without dye; optical coherence tomography, PET, SPECT, MRI, X-ray and CT scanners and other medical imaging apparatus; bright field, phase contrast, atomic force and scanning electron microscopes; satellite radar; thermographic cameras; seismographs; and sonar and electromagnetic wave detectors. Each of the image capture devices 310 may produce analog or digital images. The image captured by a single image capture device 310 may be scalar-, vector- or matrix-valued and may vary as a function of time. An imaged scene can include any physical object, collection of physical objects or physical phenomena of interest for which measurements of at least one property can be obtained by an image capture device. For example, the embryonic environment of a fetus is a scene that can be measured with an ultrasound image capture device. In another example, the position and movement of atmospheric moisture is a scene that can be measured with a satellite radar image capture device.
An image database 320 is used to store the images captured by the image capturing devices 310 as a set of image data. Image database 320 may comprise an appropriate combination of magnetic, optical and/or semiconductor memory, and may include, for example, RAM, ROM, flash drive, an optical disc such as a compact disc and/or a hard disk or drive. One skilled in the art will recognize a number of suitable implementations for image database 320 within system 300, with exemplary implementations including a database server designed to communicate with processor 330, a local storage unit or removable computer-readable media.
Information extraction processor 330 and database 320 may be embedded within the same physical unit or housed as separate devices connected to each other by a communication medium, such as a USB port, serial port cable, a coaxial cable, an Ethernet-type cable, a telephone line, a radio frequency transceiver or other similar wireless or wired medium or combination of the foregoing. Information extraction processor 330 queries database 320 to obtain a non-empty subset of the set of image data. Information extraction processor 330 also performs the information extraction processes described below. Exemplary implementations of information extraction processor 330 include the software-programmable processors found in general purpose computers, as well as specialized processor units that can be embedded within a larger apparatus. Information extraction processor 330 performs the method described herein by executing instructions stored on a computer-readable medium; one of ordinary skill in the art will recognize that such media include, without limitation, solid-state, magnetic, holographic, magneto-optical and optical memory units. Additional implementations of information extraction processor 330 and the remaining elements of
At the completion of the information extraction method, or concurrently with the method, information extraction processor 330 outputs a collection of processed data. Display 340 presents the processed data visually to a user; exemplary implementations include a computer monitor or other electronic screen, a physical print-out produced by an electronic printer in communication with information extraction processor 330, or a three-dimensional projection or model. Results database 350 is a data storage device in which the processed data is stored for further analysis. Exemplary implementations include the architectures and devices described for image database 320, as well as others known to those skilled in the art. Classification processor 360 is a data processing device that may optionally extract the processed data from database 350 in order to classify the processed data, i.e. identify the meaning and content of elements in the imaged scene, and may be embodied by the architectures and devices described for information extraction processor 330.
Although the system components 310-360 are depicted in
Returning to
As described above, whole-field, IR fundus illumination may reveal significantly more detail about retinal, subretinal and choroidal anatomy than visible light imaging, but employing a long-wavelength light source and illuminating the entire fundus may yield suboptimal contrast and resolution. IR fundus images are a composite of light that is reflected or back-scattered and light that is absorbed by ocular pigment, choroid and hemoglobin. However, when the entire fundus is illuminated, the contrast of any object-of-interest is degraded by light that is scattered by superficial, deep and lateral structures. Light is therefore multiply-scattered and contrast is degraded making edge detection extremely difficult. Conventional IR image analysis systems extract only a small fraction of the clinically-relevant information that is embedded in IR fundus images of the eye. There is a need for systems and techniques for parsing and analyzing the multilayer information of IR and other multilayer imaging modalities. Disclosed herein are systems and techniques for:
This disclosure presents a number of systems and techniques that can be applied in imaging applications in which information is to be captured from different layers of a multilayer structure. Some of the information extraction and analysis techniques described herein use one or more original images of a structure to generate one or more output images that provide information about user-specified attributes of the structure (for example, high frequency details of the structure, presented at different spatial scales and intensities). As described above, the multilayer information in the output images is not visible in the original images. The systems and techniques described herein enable fine, high-frequency details to be extracted from IR and near-IR images. The systems and techniques described herein are capable of extracting information from multiple depths in a structure by analyzing images captured using multiple different wavelengths of light or a single image captured using multiple different wavelengths of light (followed by post-processing to extract the features distinguished by the multiple wavelengths), as well as other combinations of different imaging modalities.
In some implementations of the systems and techniques described herein, the output images described above are further processed to identify particular features; for example, features indicative of lesions in retinal or subretinal layers. The improvements in retinal and subretinal feature identification achieved by the systems and methods described herein enable the effective triage and treatment of eye injuries. Effective triage may be especially important in military settings. For example, a study of soldiers evacuated from Operation Iraqi Freedom and Operation Enduring Freedom with eye injuries was performed from March 2003 through December 2004 (Aslam and Griffiths, “Eye casualties during Operation Telic,” J. R. Army Med. Corps., March 2005, pp. 34-36). Data came from the Military Office of the Surgeon General (OTSG) Patient Tracking Database. A total of 368 patients (451 eyes) were evacuated for eye-related problems: 15.8% (258 of 1,635 patients, 309 eyes) of all medical evacuations were a result of battle eye injuries (BI), 17.3% (283 of 1,635 patients, 337 eyes) were a result of eye injuries (BI and non battle injuries [NBI] combined), and 22.5% (368 of 1,635 patients, 451 eyes) of all evacuations were at least partly due to eye-related complaints. Even worse, many incipient and subtle lesions and injuries are not identified due to lack of facilities. If undiagnosed and untreated for even a few hours, the likelihood of permanent retina damage increases. The ability to analyze fundus images on site in real time may enable effective triage and selection for transport to regions where eye specialists are available, thereby improving the efficiency of current ocular telemedicine systems and the standard of care for soldiers.
In some implementations, lesions indicative of traumatic brain injuries (TBI) are automatically identified. Existing TBI identification systems and techniques require expensive, difficult-to-field modalities, such as MR imaging (which requires extensive testing facilities and trained clinicians), scanning laser ophthalmoscope (SLO) imaging, and ICG angiography (typically used for choroidal vasculature, but often producing highly noisy results). Subjective assessments from patient examination are also used, but tend to be unreliable. The improved techniques described herein for identifying trauma-related eye lesions allow the early discovery of lesions of the retina or choroid, which may correlate strongly with concomitant TBI.
Also described herein is an automated decision support system that includes or communicates with a multilayer imaging system (e.g., an IR imaging system). This system uses the techniques described herein to image, analyze and triage a patient's condition, providing faster and improved diagnosis of TBI and other injuries. In certain implementations, this automated decision support system is embedded, along with an imaging system, in a portable device that can be used in battlefield or accident scenarios.
For clarity of description, this disclosure uses the example application of IR retinal imaging to illustrate some of the particular features and advantages of these systems and techniques. The term “IR” is used herein to refer to infrared and near-infrared wavelengths. However, these systems and techniques are not limited to retinal imaging applications or to IR wavelengths, and are readily applied to any suitable medical or non-medical multilayer imaging application at any wavelength. Additionally, the systems and techniques described herein are applicable to other imaging modalities, such as optical coherence tomography (OCT), laser, scanning laser ophthalmoscope (SLO) and ICG angiography, among others.
Systems and methods for multilayer feature extraction of eye images are now discussed. In some applications, the components of the automated diagnostic support system 300 (
As described above with reference to
The automated decision support system 300 may also be configured to send and/or receive information between the automated decision support system 300 and a remote location (or a local, separate device). The automated decision support system 300 may use any wired or wireless communications protocol to send images and information. In certain implementations, the automated decision support system 300 manages communication between a portable automated decision support system and a base station or command center and enables communication between remotely-located clinicians in a battlefield setting. In certain implementations, the automated decision support system 300 is configured to retrieve previously-stored information about a patient under study, which can inform a clinician of a change in a patient's condition or provide additional factors for the clinician or the automated decision support system 300 to consider. The display 340 provides a way of informing a patient or care provider of the results of the imaging and analysis techniques described herein, and may include any device capable of communicating such information, including a visual monitor, an audio output, an electronic message to one or more receiving devices, or any combination of such displays.
At the step 504, the system 300 performs a smoothing and segmenting technique, which may take the form of any of the smoothing, segmenting and attribute estimation techniques described herein (including those described in Sections C-E, below). As described below, the smoothing and segmenting operations in these techniques may be interdependent and performed substantially simultaneously (e.g., in frequent alternating iterations). In some implementations, the segmenting and smoothing technique performed at step 504 involves determining a smoothed image and/or edge details (such as an edge field strength) at a plurality of locations in the image.
The edge field may be a scalar-valued edge field (e.g., as illustrated in Eq. 5), a vector-valued edge field (e.g., as illustrated in Eq. 6), a matrix-valued edge field (e.g., as illustrated in Eq. 7), or any combination thereof. In some implementations, the smoothing and segmenting technique performed at the step 504 includes adaptively adjusting at least one of a shape and orientation defining a neighborhood associated with a plurality of locations in the image. The system 300 may perform the segmenting and smoothing technique to reduce the value of an energy function associated with an error metric, as discussed in detail in Sections C-E, below. A detailed description of several particular implementations of step 504 follows.
In some implementations, each location in the image is associated with an elliptical neighborhood over which the image is smoothed and an edge field estimated, and the size, shape and orientation of the neighborhood vary from location to location. The neighborhoods of locations identified as edges (e.g., blood vessel edges) are adaptively reduced in size to limit the “blurring” of the edge by smoothing across the edge (e.g., as illustrated in
wherein g is the retinal image data output by the image capture device 310, u is the smoothed data, V is a 2×2 symmetric edge matrix field, X is the image over which the smoothing and segmenting takes place, and α, β, ρ are adjustable parameters. As described in detail in Sections C-E, the first term can be interpreted as a smoothness fidelity term that penalizes the gradient of u by I-V, so that smoothing occurs primarily based on pixels situated in the neighborhood. The second term is a data fidelity term penalizing deviations of the smoothed image data from the input image data. The scalar term G(V) penalizes edge strength, while F(V_X) balances a preference for smooth edges with high-frequency features that may be present in the first image.
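As a rough illustration of how the four terms described above interact, the following sketch evaluates a discretized energy with the same term structure on a pixel grid. It is not a reproduction of Eq. 1: the choice of G(V) as the squared Frobenius norm of V, the choice of the edge-smoothness term F as the squared spatial gradient of V, the way ρ weights those two terms, and the function name `energy` are all assumptions made for the example.

```python
import numpy as np

def energy(u, g, V, alpha, beta, rho):
    """Evaluate an illustrative discrete energy (assumed forms, not Eq. 1).

    u, g : (H, W) arrays, the smoothed image and the input image
    V    : (H, W, 2, 2) array, a symmetric edge matrix at each pixel
    """
    # Finite-difference gradient of u, stored as (d/dx, d/dy) per pixel.
    gy, gx = np.gradient(u)
    grad_u = np.stack([gx, gy], axis=-1)                      # (H, W, 2)

    # Smoothness term: the gradient of u is weighted by (I - V), so a
    # strong edge (V close to I) switches the penalty off locally.
    damped = np.einsum("hwij,hwj->hwi", np.eye(2) - V, grad_u)
    smoothness = alpha * np.sum(damped ** 2)

    # Data fidelity term: deviation of the smoothed image from the input.
    fidelity = beta * np.sum((u - g) ** 2)

    # ASSUMPTION: G(V) is the squared Frobenius norm of V.
    edge_strength = np.sum(V ** 2) / (2.0 * rho)

    # ASSUMPTION: F is the squared spatial gradient of the entries of V.
    dV = np.gradient(V, axis=(0, 1))
    edge_smoothness = 0.5 * rho * sum(np.sum(d ** 2) for d in dV)

    return smoothness + fidelity + edge_smoothness + edge_strength
```

With V set to zero everywhere the smoothness term penalizes every gradient in u; with V set to the identity the same gradients cost nothing in that term, but the edge-strength term is paid at every pixel instead.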
While any numerical technique may be used to solve Eq. 1 for any particular image and parameter values, one approach includes the use of the Euler-Lagrange equations that form the basis of the solution. For Eq. 1, the Euler-Lagrange equations are:
Additional or alternate smoothing and segmenting approaches may also be applied at the step 504, such as standard spatial filtering and denoising techniques.
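The interdependence of smoothing and segmenting can be sketched with a simplified alternating scheme. The example below is an assumption-laden stand-in, not the matrix-valued method of this disclosure: it uses a scalar edge field v in the style of the Ambrosio-Tortorelli functional, drops the edge-smoothness term for brevity, and updates the image with Jacobi sweeps. The function name and default parameter values are hypothetical.

```python
import numpy as np

def smooth_and_segment(g, alpha=5.0, beta=1.0, rho=5.0, iters=50):
    """Alternate edge estimation and edge-aware smoothing (scalar sketch)."""
    g = np.asarray(g, dtype=float)
    H, W = g.shape
    u = g.copy()
    v = np.zeros_like(g)
    for _ in range(iters):
        # Edge step: pointwise minimizer of
        #   alpha * (1 - v)^2 * |grad u|^2 + v^2 / (2 * rho)
        gy, gx = np.gradient(u)
        s = gx ** 2 + gy ** 2
        v = 2.0 * alpha * s / (2.0 * alpha * s + 1.0 / rho)

        # Smoothing step: one Jacobi sweep. Each pixel becomes a weighted
        # average of its input value and its four neighbors; the neighbor
        # weights shrink where the edge field is strong.
        w = (1.0 - v) ** 2
        up = np.pad(u, 1, mode="edge")
        wp = np.pad(w, 1, mode="edge")
        num = beta * g
        den = np.full_like(g, beta)
        for dy, dx in ((0, 1), (0, -1), (1, 0), (-1, 0)):
            un = up[1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
            wn = wp[1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
            w_pq = alpha * 0.5 * (w + wn)
            num = num + w_pq * un
            den = den + w_pq
        u = num / den
    return u, v
```

Because the edge field is recomputed from the current smoothed image on every pass, and the smoothing weights are recomputed from the current edge field, neither result is finalized before the other, which is the sense in which the two operations are interdependent.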
At the step 506, the system 300 determines a value of an attribute at a plurality of locations within the first image. The attribute is indicative of at least one retinal or subretinal feature in the first image.
At the step 508, the system 300 generates a first attribute image based on the attribute determination of the step 506. Specifically, for each attribute of interest, an image is generated indicating the value of that attribute for each pixel in the original image. This image may serve as the attribute image generated at the step 508, or may be further processed to generate the attribute image. If N attributes are determined at the step 506, then N attribute images may be generated at the step 508. In certain implementations, fewer or more than N attribute images are generated, and each attribute image may include a combination of information from two or more attribute images.
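The one-image-per-attribute bookkeeping can be sketched as follows. The two attributes used here (gradient magnitude and a local high-frequency detail measure) are illustrative choices only, and the function names are hypothetical; the disclosure does not limit attributes to these.

```python
import numpy as np

def gradient_magnitude(image):
    """Example attribute: local edge strength."""
    gy, gx = np.gradient(image)
    return np.hypot(gx, gy)

def local_detail(image):
    """Example attribute: pixel value minus its 3x3 neighborhood mean."""
    H, W = image.shape
    padded = np.pad(image, 1, mode="edge")
    mean = sum(padded[i:i + H, j:j + W]
               for i in range(3) for j in range(3)) / 9.0
    return image - mean

def attribute_images(image, attributes):
    """Return one attribute image per named attribute function."""
    image = np.asarray(image, dtype=float)
    out = {}
    for name, fn in attributes.items():
        values = np.asarray(fn(image), dtype=float)
        if values.shape != image.shape:
            raise ValueError(f"attribute {name!r} must give one value per pixel")
        out[name] = values
    return out
```

Calling `attribute_images(img, {"edge_strength": gradient_magnitude, "detail": local_detail})` yields two images, each the same size as the original, which may then be used directly or combined.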
Several examples of attribute images generated at the step 506 (
A second example of an attribute image generated at the step 506 (
A third example of an attribute image generated at the step 506 (
Returning to the information extraction and analysis process depicted in
At the step 512, the features extracted and models built at the step 510 are used by the automated decision support system 300 (
For example, the system 300 may be configured to detect a lesion at the step 512. In some implementations, a lesion is characterized by its grade or severity (particularly those lesions due to macular degeneration and traumatic retinal injuries) according to clinician-specified or learned criteria. In some implementations of the step 512, the system 300 indicates that a traumatic brain injury feature has been observed. As described above, traumatic brain injuries (TBIs) are detected by identifying certain retinal and subretinal features that characterize a zone 3 injury (such as a choroidal rupture, a macular hole, or a retinal detachment). In some implementations, when the system 300 identifies any features related to a potential TBI, the system 300 provides the attribute image to a clinician display and, optionally, further provides a message indicating that a potential TBI has been detected. In some implementations, the system 300 indicates the location of the TBI-related feature on the attribute image. In response, a clinician may input a triage category for the subject into the system 300 to route the subject into appropriate channels for additional medical care or follow-up.
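The alerting logic described above can be sketched as a simple rule over already-identified features. The data structure, the function name and the message format below are hypothetical; only the three zone 3 injury types are taken from the description, and the feature identification itself is assumed to have happened upstream.

```python
from dataclasses import dataclass

# Zone 3 injury types named in the description.
ZONE3_FEATURES = {"choroidal rupture", "macular hole", "retinal detachment"}

@dataclass
class DetectedFeature:
    label: str   # feature type assigned by an upstream classifier
    row: int     # pixel location on the attribute image
    col: int

def tbi_alert(features):
    """Return an alert message if any TBI-related feature is present."""
    hits = [f for f in features if f.label.lower() in ZONE3_FEATURES]
    if not hits:
        return None
    where = ", ".join(f"{f.label} at ({f.row}, {f.col})" for f in hits)
    return f"Potential TBI detected: {where}"
```

The returned message and locations would accompany the attribute image on the clinician display; the triage category remains a clinician input rather than something this sketch computes.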
An example of a central idiopathic macular hole, a subretinal feature that is indicative of a TBI, is illustrated in
In some implementations, the process 500 includes additional image analysis or display steps that extract and provide information derived from multi-modal imaging (for example, extracting and analyzing information from visible light and IR images). For example, at the step 502, a plurality of images may be received, which may correspond to different or similar imaging modalities. In some implementations, the smoothing and segmenting framework discussed above with reference to the step 504 is configured for fusing and analyzing information extracted from multiple imaging modalities (or from multiple wavelengths of one imaging modality, such as multiple IR wavelengths). One way in which this framework is configured for multi-modal imaging applications is by including additional terms in an energy functional formulation that address the smoothing and segmenting of images from the additional modalities. The additional terms may take the same form as the terms used for the first image or may take modified forms (e.g., using a different norm to evaluate distance, using different neighborhoods, or assigning different weights to different terms). Information from multiple imaging modalities may also be fused after smoothing and segmenting. In some implementations, the attribute images are computationally fused by summing, averaging, or by using any other combination technique. Images can be presented in two or three dimensions, and fused as topographic images with images obtained from optical coherence tomography, RGB images, and other IR images including those obtained by various scanning laser ophthalmoscopes, for example.
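Fusion of attribute images by summing or averaging can be sketched in a few lines. The example assumes the attribute images are already co-registered and of identical size, which the description does not state explicitly; the function name and the optional per-modality weights are hypothetical.

```python
import numpy as np

def fuse_attribute_images(images, method="average", weights=None):
    """Fuse co-registered attribute images pixel by pixel.

    images  : sequence of (H, W) arrays, one per modality or wavelength
    method  : "sum" or "average"
    weights : optional per-image weights, used only for "average"
    """
    stack = np.stack([np.asarray(im, dtype=float) for im in images])
    if method == "sum":
        return stack.sum(axis=0)
    if method == "average":
        return np.average(stack, axis=0, weights=weights)
    raise ValueError(f"unknown fusion method: {method!r}")
```

For instance, `fuse_attribute_images([ir_attr, visible_attr], weights=[3, 1])` would weight the IR-derived attribute image three times as heavily as the visible-light one, where both input names are placeholders.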
In some implementations, the smoothing and segmenting framework discussed above with reference to the step 504 is configured for fusing and analyzing information extracted from an imaging modality (such as IR imaging) and information from a stored information source (such as information generated by other eye or physiological sensors, information from the scientific literature, demographic information, and previous images of the eye under study, for example).
In some implementations, the attribute images are displayed concurrently for clinicians to visually analyze and compare. For example,
Furthermore, the systems and techniques disclosed herein can take into account additional medical information, imaging, laboratory tests, or other diagnostic information. In particular, the systems and techniques disclosed herein can detect changes across multiple images taken at different points in time. For example, in some implementations, the system 300 retrieves, from the image database 320, multiple images of the same subject taken at different times, and compares these images to detect changes in retinal or subretinal structure.
At the step 1102, the system 300 determines a textural property of a portion of a first image of the eye based at least in part on the first attribute image. A textural property is a value of an attribute or a quantity derived from the value or values of one or more attributes. A textural property may be defined pixel-by-pixel in an image, or may be defined over a region within the first image (e.g., within the boundaries of an identified retinal or subretinal feature). In some implementations, the textural property is the value of the smoothed image generated by the system 300 at the step 504 of
At the step 1104, the system 300 compares the first image of the eye to a second image of the eye by comparing the textural property of the portion of the first image of the eye (as determined at the step 1102) to a corresponding textural property of a corresponding portion of the second image of the eye. In some implementations, the system 300 compares the first and second images by performing a statistical change analysis, such as a t-test, to identify whether there is a statistically significant difference between the first and second images. The confidence intervals and other parameters for the statistical change analysis are selected according to the application. In implementations in which the first attribute image is decomposed into components at the step 1102, the system 300 compares the textural properties of the first and second images by comparing the respective decomposition coefficients to determine whether a statistically significant difference exists. Although the process 1100 is described with reference to multiple images of the same eye (e.g., taken at different times), the system 300 may be configured to implement the process 1100 to analyze images of different eyes of a same subject or images of eyes of different subjects, for example.
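A minimal sketch of such a coefficient comparison follows. It assumes a single-level Haar wavelet decomposition and a Welch t statistic on the magnitudes of the detail coefficients; the wavelet family, the statistic, the threshold value and all function names are illustrative assumptions, since the description leaves these choices to the application.

```python
import numpy as np

def haar_level1(region):
    """Single-level orthonormal 2D Haar decomposition of an image region."""
    a = np.asarray(region, dtype=float)
    a = a[:a.shape[0] // 2 * 2, :a.shape[1] // 2 * 2]   # crop to even size
    p, q = a[0::2, 0::2], a[0::2, 1::2]
    r, s = a[1::2, 0::2], a[1::2, 1::2]
    approx = (p + q + r + s) / 2.0
    row_diff = (p + q - r - s) / 2.0    # top row minus bottom row
    col_diff = (p - q + r - s) / 2.0    # left column minus right column
    diag = (p - q - r + s) / 2.0
    return approx, (row_diff, col_diff, diag)

def welch_t(x, y):
    """Welch t statistic for two samples (undefined if both are constant)."""
    x, y = np.ravel(x), np.ravel(y)
    se = np.sqrt(x.var(ddof=1) / x.size + y.var(ddof=1) / y.size)
    return (x.mean() - y.mean()) / se

def texture_changed(region1, region2, threshold=3.0):
    """Flag a change when detail-coefficient magnitudes differ strongly.

    threshold is a hypothetical cutoff on |t|, to be set per application.
    """
    _, d1 = haar_level1(region1)
    _, d2 = haar_level1(region2)
    m1 = np.concatenate([np.abs(d).ravel() for d in d1])
    m2 = np.concatenate([np.abs(d).ravel() for d in d2])
    t = welch_t(m1, m2)
    return bool(abs(t) > threshold), float(t)
```

Here `region1` and `region2` stand for corresponding portions of the two attribute images, for example the pixels inside the boundary of the same identified feature at two points in time.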
At the step 1106, the system 300 provides the result of the change analysis (e.g., to another support or analysis system, or to a clinician). In some implementations, the change analysis identifies one or more portions of the first and second images which are different between the images and these portions are identified graphically or textually at the step 1106. In some implementations, the change analysis determines whether a statistically significant difference exists, and notifies a clinician of the presence of a difference to prompt further investigation. This time-based analysis can identify subtle changes in retinal or subretinal structure which may reflect any of a number of conditions, such as inadequate tissue perfusion and the formation or growth of lesions. The system 300, when configured to execute the process 1100, may be included in any of a number of medical diagnostic systems, including a disease progression tracking system, a treatment efficacy evaluation system, and a blood diffusion tracking system.
Compared to previous image analysis systems, the systems and techniques disclosed herein for image analysis exhibit an improved ability to detect fine features within images in the presence of noise. This detection capability is illustrated in
SNR=10 log(I/σ) (4)
where I is the intensity of the white line added, and σ is the noise standard deviation. After undergoing smoothing and segmenting as described above with reference to step 504 of
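As a worked illustration of Eq. 4, the sketch below computes the SNR for a given line intensity and noise level and builds a synthetic test image of the kind described. The logarithm is taken as base 10, as is conventional for decibel quantities; that choice, the zero-mean Gaussian background, the one-pixel horizontal line and the function names are assumptions.

```python
import numpy as np

def snr_db(line_intensity, noise_std):
    """Eq. 4, with the logarithm assumed to be base 10."""
    return 10.0 * np.log10(line_intensity / noise_std)

def noisy_line_image(shape, line_intensity, noise_std, seed=0):
    """Zero-mean Gaussian noise with a one-pixel-wide bright line added."""
    rng = np.random.default_rng(seed)
    image = rng.normal(0.0, noise_std, size=shape)
    image[shape[0] // 2, :] += line_intensity   # horizontal line, middle row
    return image
```

Under these assumptions, a line whose intensity equals the noise standard deviation has an SNR of 0, and a line ten times brighter than the noise standard deviation has an SNR of 10.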
In some implementations, an IR image or a feature within an IR image, processed according to the process 500 of
As indicated in Sections A and B, the information extraction processor 330 of
The smoothing process 1540 generates a set of smoothed data 1570 from the image data. Smoothed data 1570 represents the most accurate estimate of the true characteristics of the imaged scene. Images are often corrupted by noise and by distortions from the imaging equipment, and consequently, the image data is never a perfect representation of the true scene. When performing smoothing 1540, the processor 330 takes into account, among other factors, the image data, physical models of the imaged scene, characteristics of the noise arising at all points between the imaged scene and the database 320, as well as the results of the segmenting process 1550 and attribute estimation process 1560.
The segmenting process 1550 demarcates distinct elements within the imaged scene by drawing edges that distinguish one element from another. For example, in some implementations, the segmenting process distinguishes between an object and its background, several objects that overlap within the imaged scene, or regions within an imaged scene that exhibit different attributes. The segmenting process results in a set of edges that define the segments 1580. These edges may be scalar, vector, or matrix-valued, or may represent other data types. When performing segmenting 1550, the information extraction processor 330 takes into account, among other factors, the image data 1510, physical models of the imaged scene, characteristics of the noise arising at all points between the imaged scene and the image database 320, as well as the results of the smoothing process 1540 and attribute estimation process 1560.
The attribute estimation process 1560 identifies properties of the elements in the imaged scene. An attribute is any property of an object about which the image data contains some information. The set of available attributes depends upon the imaging modalities represented within the image data. For example, a thermographic camera generates images from infrared radiation; these images contain information about the temperature of objects in the imaged scene. Additional examples of attributes include texture, radioactivity, moisture content, color, and material composition, among many others. For example, the surface of a pineapple may be identified by the processor as having a texture (the attribute) that is rough (a value of the attribute). In one implementation, the attribute of interest is the parameter underlying a parameterized family of models that describe the image data. In another implementation, the attribute of interest is the parametric model itself. When performing attribute estimation, the information extraction processor 330 takes into account, among other factors, the image data 1510, physical models of the imaged scene, characteristics of the noise arising at all points between the imaged scene and the image database 320, as well as the results of the smoothing process 1540 and segmenting process 1550.
In some implementations, when more than one image is represented in the image data, the information extraction processor 330 determines, for a particular attribute, the relative amounts of information contained in each image. When estimating this attribute, the information extraction processor 330 utilizes each image according to its information content regarding the attribute. For example, multi-spectral imaging returns multiple images, each of which was produced by a camera operating in particular wavelength bands. Different attributes may be better represented in one frequency band than another. For example, satellites use the 450-520 nm wavelength range to image deep water, but the 1550-1750 nm wavelength range to image ground vegetation. Additionally, in some implementations, the information extraction processor 330 uses statistics of the image data to identify images of particular relevance to an attribute of interest. For example, one or more different weighted combinations of image data may be identified as having more information content as compared to other combinations for any particular attribute. The techniques disclosed herein allow the attribute estimation process, interdependently with the smoothing and segmenting processes, to preferentially utilize data from different images.
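One way to picture utilizing each image according to its information content is inverse-variance weighting. The sketch below assumes that each image yields a noisy estimate of the same attribute at a given location, and that the information content of an image is modeled as the inverse of its noise variance. Both the model and the function name `fuse_by_information` are illustrative assumptions rather than the method of the information extraction processor 330.

```python
def fuse_by_information(estimates, variances):
    """Combine per-image estimates of one attribute at one location.

    Assumption: the information an image carries about the attribute is
    modeled as the inverse of its noise variance (hypothetical model).
    Returns the fused estimate and the normalized weight of each image.
    """
    if len(estimates) != len(variances):
        raise ValueError("one variance is required per estimate")
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fused = sum(w * e for w, e in zip(weights, estimates)) / total
    return fused, [w / total for w in weights]


# Two images: the first is four times less noisy, so it dominates the result.
fused, weights = fuse_by_information([10.0, 14.0], [1.0, 4.0])
```

In this example the less noisy image receives a weight of 0.8 and the noisier image a weight of 0.2.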
Additionally, in some implementations, the information extraction processor 330 preferentially utilizes data in different ways at different locations in the imaged scene for any of the smoothing, segmenting and attribute estimation processes. For example, if each image in a data set corresponds to a photograph of a person taken at a different angle, only a subset of those images will contain information about the person's facial features. Therefore, these images will be preferentially used by information extraction processor 330 to extract information about the facial region in the imaged scene. The information extraction method presented herein is capable of preferentially utilizing the image data to resolve elements in the imaged scene at different locations, interdependently with the smoothing, segmenting and attribute estimation processes.
It is important to note that the number of attributes of interest and the number of images available can be independent. For example, several attributes can be estimated within a single image, or multiple images may be combined to estimate a single attribute.
When producing a set of smoothed data 1570 from noisy images, or classifying segments according to their attribute values, it is desirable to be able to distinguish which locations within the imaged scene correspond to edges and which do not. When an edge is identified, the information extraction processor 330 can then treat locations on either side of that edge and on the edge itself separately, improving smoothing and classification performance. It is desirable, then, to use local information preferentially during the smoothing, segmenting and attribute estimation processes. Thus, in one implementation, decisions are made at each location based on a neighborhood of surrounding locations in an adaptive neighborhood adjustment process 1565. One implementation associates a neighborhood with each particular location in an imaged scene. Each neighborhood includes a number of other locations near the particular location. Information extraction processor 330 can then use the neighborhood of each location to focus the smoothing, segmenting and attribute estimation processes 1540-1560 to more appropriately extract information about the location. In its simplest form, the neighborhoods associated with each location could have a fixed size, shape and orientation, e.g. a circle with a fixed radius. However, using an inflexible neighborhood size and shape has a number of drawbacks. For example, if a location is located on an edge, then the smoothing and attribute estimation processes that rely on the fixed neighborhood will use information from the scene elements on either side of the edge, leading to spurious results. One improvement is adjusting the size of the neighborhood of each location based on local information. A further improvement comprises adjusting the size, shape and orientation of the neighborhood of a location to better match the local characteristics in an adaptive neighborhood adjustment process 1565. These examples will be described in greater detail below.
In one implementation, information extraction processor 330 performs the information extraction method while adjusting the size, shape and orientation characteristics of neighborhoods surrounding locations in the imaged scene. In particular, the processor 330 adapts the characteristics of the neighborhoods associated with each location interdependently with the smoothing, segmenting and attribute estimation processes 1540-1560. In another implementation, the information extraction processor 330 utilizes separate, independently adapted neighborhoods for each attribute analyzed by the information extraction processor 330.
The benefits of using adaptive neighborhood size, shape and orientation can be seen in
To demonstrate the improvement that such adaptation can provide, consider an exemplary implementation of the information extraction method which includes an averaging step within the smoothing process 1540 to reduce noise present in the raw image data. The averaging step produces a smoothed data value at each location (with an associated neighborhood) by replacing the image data value at that location with the average of the image data values at each of the locations that fall within the associated neighborhood.
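The effect of the averaging step can be sketched in one dimension. The example below compares a fixed-size neighborhood with a neighborhood that is cut off at known edges. It assumes that the edge locations are already available, whereas the method described herein determines them interdependently with the smoothing. The function names and the encoding of `edges` are hypothetical.

```python
def fixed_mean(signal, radius):
    """Average each sample over a fixed window of the given radius."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def edge_limited_mean(signal, radius, edges):
    """Average over a window that never crosses a marked edge.

    edges[k] is True when an edge lies between samples k and k + 1
    (an encoding chosen only for this sketch).
    """
    n = len(signal)
    out = []
    for i in range(n):
        lo = i
        while lo > 0 and i - lo < radius and not edges[lo - 1]:
            lo -= 1
        hi = i
        while hi < n - 1 and hi - i < radius and not edges[hi]:
            hi += 1
        out.append(sum(signal[lo:hi + 1]) / (hi - lo + 1))
    return out


step = [0.0, 0.0, 0.0, 0.0, 10.0, 10.0, 10.0, 10.0]
edges = [False, False, False, True, False, False, False]  # edge between 3 and 4
blurred = fixed_mean(step, 2)
preserved = edge_limited_mean(step, 2, edges)
```

With the fixed window, samples near the step are averaged with values from the other side of the edge and the step is smeared. With the edge-limited window, the step is left intact.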
With reference to
wherein g is the image data, u is the smoothed data, α, β are adjustable parameters and the integral is taken over all locations X in region R.
In
wherein g is the image data, u is the smoothed data, v is the edge values and α, β, ρ are adjustable parameters. A method related to that illustrated in
In
wherein g is the image data, u is the smoothed data; V is a symmetric, positive-definite 2×2 matrix representing the neighborhood; w weights the data fidelity terms; F and G are functions, and α, β, ρ, ρw are adjustable parameters. The information extraction processor 330 can also use information arising from the smoothing and attribute estimation processes 1540 and 1560 to adjust the size, shape and orientation of neighborhoods.
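To illustrate how a symmetric, positive-definite 2×2 matrix can encode size, shape and orientation at once, the sketch below interprets a matrix V = [[a, b], [b, c]] as the ellipse of displacements d satisfying dᵀVd ≤ 1. This interpretation is an assumption made for illustration only, since the exact role of V in the expression above is not reproduced here, and the function name `neighborhood_ellipse` is hypothetical.

```python
import math


def neighborhood_ellipse(a, b, c):
    """Describe the ellipse d^T V d <= 1 for V = [[a, b], [b, c]].

    Returns (major_semi_axis, minor_semi_axis, major_axis_angle), with the
    angle in radians in [0, pi), measured from the first coordinate axis.
    """
    if a <= 0 or a * c - b * b <= 0:
        raise ValueError("V must be symmetric positive-definite")
    mean = 0.5 * (a + c)
    spread = math.hypot(0.5 * (a - c), b)
    lam_small, lam_large = mean - spread, mean + spread  # eigenvalues of V
    # Direction of the eigenvector for the larger eigenvalue; the ellipse is
    # shortest along it, so the major axis is perpendicular to it.
    angle_large = 0.5 * math.atan2(2.0 * b, a - c)
    angle_major = (angle_large + math.pi / 2.0) % math.pi
    return 1.0 / math.sqrt(lam_small), 1.0 / math.sqrt(lam_large), angle_major


# V = diag(4, 1): reach 0.5 along the first axis and 1.0 along the second.
major, minor, angle = neighborhood_ellipse(4.0, 0.0, 1.0)
```

Under this reading, a single matrix per location is enough to describe an elongated neighborhood aligned with a nearby edge.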
One particular implementation of the information extraction method is illustrated in
In one version of this implementation, the energy value is calculated in accordance with the following expression:
where e1, e2, e3, e4, e5 are error terms as described below. Values for the smoothed data u, the edges of the segments νu, attribute θ and the edges of the attribute segments νθ, are chosen for each (x, y) coordinate in order to minimize the expression contained in square brackets, integrated over the entire plane. This expression relies on the image data g, a data function T(θ) with attribute θ, and parameters λu, αu, ρu, λθ, αθ, ρθ, where
The term e1 is a penalty for a mismatch between the image data and the smoothed data, the term e2 is a penalty for discontinuity in the smoothed data, the term e3 includes penalties for the presence of an edge and the discontinuity of the edge, the term e4 is a penalty for discontinuity in the attribute estimate and the term e5 includes penalties for the presence of an attribute edge and the discontinuity of the attribute edge. One skilled in the art will recognize that there are many additional penalties that could be included in the energy function, and that the choice of appropriate penalties depends upon the application at hand. Equivalently, this problem could be expressed as the maximization of a reward function, in which different reward terms correspond to different desirable performance requirements for the information extraction method. There are many standard numerical techniques that could be readily applied to this specific mathematical formulation by one skilled in the art: for example, gradient descent methods. These techniques could be implemented in any of the implementations described herein.
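The following is a minimal one-dimensional sketch of minimizing an energy of this general kind. It is not the expression of the described implementation. It keeps only a data mismatch penalty, a smoothness penalty that is switched off where an edge is present, and a penalty for the presence of an edge. It omits the attribute terms and the penalty on edge discontinuity. The edge values are updated in closed form and the smoothed data by gradient descent. The function names, the weights `lam` and `alpha`, the step size and the sample data are all hypothetical choices.

```python
def energy(u, v, g, lam, alpha):
    """Discrete energy: mismatch + edge-gated smoothness + edge presence."""
    fidelity = sum((ui - gi) ** 2 for ui, gi in zip(u, g))
    smooth = sum((1 - v[i]) ** 2 * (u[i + 1] - u[i]) ** 2 for i in range(len(v)))
    edge = sum(vi ** 2 for vi in v)
    return fidelity + lam * smooth + alpha * edge


def minimize(g, lam=4.0, alpha=1.0, step=0.02, iterations=500):
    """Alternate edge updates and gradient steps on the smoothed data."""
    n = len(g)
    u = list(g)
    v = [0.0] * (n - 1)  # v[i] near 1 marks an edge between samples i and i + 1
    for _ in range(iterations):
        # Edge update: exact minimizer of the energy in each v[i], u fixed.
        for i in range(n - 1):
            d2 = (u[i + 1] - u[i]) ** 2
            v[i] = lam * d2 / (lam * d2 + alpha)
        # Smoothed-data update: one gradient descent step, v fixed.
        grad = [2.0 * (u[i] - g[i]) for i in range(n)]
        for i in range(n - 1):
            w = 2.0 * lam * (1 - v[i]) ** 2 * (u[i + 1] - u[i])
            grad[i] -= w
            grad[i + 1] += w
        u = [ui - step * gi for ui, gi in zip(u, grad)]
    return u, v


g = [0.1, -0.1, 0.05, 0.0, 5.0, 5.1, 4.9, 5.0]  # made-up noisy step
u, v = minimize(g)
```

In this sketch the edge value between the fourth and fifth samples approaches one, so smoothing does not cross the step, while the small fluctuations on either side are averaged out.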
In another implementation, the calculation of the minimum energy value is performed in accordance with the following expression:
where e1, e2, e3, e4, e5 are error terms as described below. Values for the smoothed data u, the edges of the segments w, the edge field of the measurement model parameters νm, the edge field of the process model parameters νu, the edge field of the process parameter correlations νc, the process model parameters θu, and the measurement model parameters θm are chosen for each (x1, x2, . . . , xN, t) coordinate in order to minimize the expression contained in square brackets, integrated over the entire N-dimensional image data space augmented with a one-dimensional time variable. The error terms are given by
e1 = βM(u, g, w, θm),
e2 = αmLm(θm, νm),
e3 = αuCu(u, νu, θu),
e4 = αcLc(νc, θu), and
e5 = π(u, w, νm, νu, νc, θu, θm)
where M is a function that measures data fidelity, Lm estimates measurement model parameters, Cu measures process model spatial correlation, Lc estimates process model parameters, π represents prior distributions of the unknown variables and β, αm, αu, αc are parameters that allow the process to place different emphasis on the terms e1, e2, e3, e4.
Additional image processing techniques may also be used with the smoothing, segmenting and attribute determination techniques described herein. For example, as discussed above, the image analysis techniques described herein can identify attributes of an image, such as texture. The Matrix Edge Onion Peel (MEOP) methodology may be used to identify features on the basis of their texture. In some embodiments, where textural regions are sufficiently large, a texture wavelet analysis algorithm may be used, but combined with an MEOP algorithm for textural regions of small size. This methodology is described in Desai et al., “Noise Adaptive Matrix Edge Field Analysis of Small Sized Heterogeneous Onion Layered Textures for Characterizing Human Embryonic Stem Cell Nuclei,” ISBI 2009, pp. 1386-1389, incorporated by reference in its entirety herein. An energy functional approach may be used for simultaneous smoothing and segmentation. The methodology includes two features: a matrix edge field, and adaptive weighting of the measurements relative to the smoothing process model. The matrix edge function adaptively and implicitly modulates the shape, size, and orientation of smoothing neighborhoods over different regions of the texture. It thus provides directional information on the texture that is not available in the more conventional scalar edge field based approaches. The adaptive measurement weighting varies the weighting between the measurements at each pixel.
In some embodiments, nonparametric methods for identifying retinal abnormalities may be used. These methods may be based on combining level set methods, multiresolution wavelet analysis, and non-parametric estimation of the density functions of the wavelet coefficients from the decomposition. Additionally, to deal with small size textures where the largest inscribed rectangular window may not contain a sufficient number of pixels for multiresolution analysis, the system 300 may be configured to perform adjustable windowing to enable the multiresolution analysis of elongated and irregularly shaped nuclei. In some exemplary embodiments, the adjustable windowing approach combined with non-parametric density models yields better classification for cases where parametric density modeling of wavelet coefficients may not be applicable.
Such methods also allow for multiscale qualitative monitoring of images over time, at multiple spatiotemporal resolutions. Statistical multiresolution wavelet texture analysis has been shown to be effective when combined with a parametric statistical model, the generalized Gaussian density (GGD), used to represent the wavelet coefficients in the detail subbands. Parametric statistical multiresolution wavelet analysis as previously implemented, however, has limitations: 1) it requires a user to manually select rectangular, texturally homogeneous regions of sufficient size to enable texture analysis, and 2) it assumes the distribution of coefficients is symmetric, unimodal, and unbiased, which may be untrue for some textures. As described above, in some applications, the Matrix Edge Onion Peel algorithm may be used for small size irregularly shaped structures that exhibit “onion layer” textural variation (i.e., texture characteristics that change as a function of the radius from the center of the structure).
In some embodiments, an algorithm may be used to automatically segment features, and an adjustable windowing method may be used in order to maximize the number of coefficients available from the multiresolution decomposition of a small, irregularly shaped (i.e., non-rectangular) region. These steps enable the automatic analysis of images with multiple features, eliminating the need for a human to manually select windows in order to perform texture analysis. Finally, a non-parametric statistical analysis may be applied to cases where the parametric GGD model is inapplicable, and in such cases may yield superior performance over the parametric model.
A number of additional image processing techniques are suitable for use in the imaging systems and methods disclosed herein, including wavelet-based texture models, adaptive windowing for coefficient extraction, PDF and textural dissimilarity estimation, density models such as the generalized Gaussian and symmetric alpha-stable, and KLD estimators such as the Ahmad-Lin or the Loftsgaarden-Quesenberry.
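As an illustration of textural dissimilarity estimation with the generalized Gaussian density, the sketch below evaluates the Kullback-Leibler divergence (KLD) between two zero-mean GGDs in closed form. It assumes the parameterization p(x) = β / (2αΓ(1/β)) · exp(−(|x|/α)^β), with scale α and shape β. The function name `ggd_kld` is hypothetical, and the fitting of α and β to the wavelet coefficients of a subband is not shown.

```python
import math


def ggd_kld(alpha1, beta1, alpha2, beta2):
    """KL(p1 || p2) between two zero-mean generalized Gaussian densities,
    each given by a scale alpha and a shape beta (closed-form expression)."""
    log_term = math.log((beta1 * alpha2 * math.gamma(1.0 / beta2))
                        / (beta2 * alpha1 * math.gamma(1.0 / beta1)))
    ratio_term = ((alpha1 / alpha2) ** beta2
                  * math.gamma((beta2 + 1.0) / beta1) / math.gamma(1.0 / beta1))
    return log_term + ratio_term - 1.0 / beta1


# Sanity check against the Gaussian case (beta = 2, alpha = sigma * sqrt(2)).
kld_gauss = ggd_kld(math.sqrt(2.0), 2.0, 2.0 * math.sqrt(2.0), 2.0)
```

For β = 2 the expression reduces to the familiar divergence between two zero-mean Gaussians, which offers a quick consistency check. A dissimilarity between two textures could then be formed, for example, by summing such divergences over the detail subbands, although that aggregation is an assumption rather than something stated above.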
In some embodiments, more than one of the techniques described herein may be used in combination, for example, in parallel, in series, or fused using nonlinear classifiers such as support vector machines or probabilistic methods. Using multiple techniques for each retinal or subretinal feature may improve accuracy without substantially compromising speed.
The invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The foregoing embodiments are therefore to be considered in all respects illustrative, rather than limiting of the invention.
This application claims the benefit of U.S. Provisional Application No. 61/403,380, “Systems and Methods for Multilayer Imaging and Retinal Injury Analysis,” filed Sep. 15, 2010, and incorporated by reference herein in its entirety.
Work described herein was funded, in whole or in part, by Grant No. RO1-EB006161-01A4 from the National Institutes of Health (NIH/NIBIB). The United States Government has certain rights in this invention.
Number | Date | Country
---|---|---
61403380 | Sep 2010 | US