The present invention relates to imaging using techniques such as ultrasound echo imaging, and in particular to the combination of images of a common object.
Ultrasound pulse-echo imaging involves imaging of an object using an analysis beam of ultrasound. The echo signal is detected to generate an image. An important type of ultrasound echo imaging is echocardiography. Several imaging techniques (X-ray, angiography and MRI) have proven to be useful in cardiology, but echocardiography presents unique characteristics that make it the most commonly applied imaging diagnostic technique in clinical practice.
Traditionally, two-dimensional (2D) echocardiography has been widely used as a relatively cheap, portable, real-time interactive means of assessing heart function. Using a 2D imaging modality introduces disadvantages such as the difficulty of characterizing three-dimensional structures and the dependence on the probe position, which makes it difficult to compare images acquired by different clinicians or at different times.
Recently developed technology has made it possible, for the first time, to acquire three-dimensional (3D) ultrasound echo images of the heart in real time. This new imaging modality opens a wide range of possibilities for echocardiography in clinical routine. However, at its present stage, due to the limited field of view of the transducer, it is not possible to scan the whole adult heart in a single acquisition (and, in some cases, not even the left ventricle). Furthermore, some cardiac structures can be appreciated only from particular acoustic windows. For these reasons, a complete diagnostic study in real-time 3D ultrasound (RT3DUS) will consist of several acquisitions made from different positions. The development of tools to combine these acquisitions and present a single, optimal dataset to the clinician could greatly improve the clinical uses of the technology.
In general, it is known to combine ultrasound images acquired with the same or different orientation of the analysis beam relative to the object, as a way to improve image quality, mainly through speckle reduction. Usually, simple techniques are used to combine the images, such as taking the mean or the maximum intensity at each pixel. This is often referred to as compounding the images.
However, whilst such known techniques for combining images can reduce speckle and other noise, they can result in a reduction of the information content. This is because the quality of the images can vary depending on a number of factors. Examples of such factors in ultrasound echo imaging are the nature of the ultrasound transducer used, the acoustic properties of the object being imaged, the conditions during image acquisition and the alignment between the object and the ultrasound analysis beam. Variation in image quality between images means that the known techniques can cause high quality information in one of the images to be masked by low quality information in another one of the images, resulting in a net reduction in the information content in some parts of the combined image. This affects visualisation and the quality of image analysis which can be performed, for example by a clinician in the case of an echocardiography image or other clinical image.
It would be desirable to obtain combined images in which these problems are reduced.
According to the present invention, there is provided a method of combining a plurality of images of a common object, which images are in registration with each other and have been acquired using an imaging technique which uses an analysis beam and is dependent on the angle of the analysis beam relative to the object, the method comprising:
deriving, in respect of each image, feature measures in respect of each pixel, being measures of the degree of detectability of a feature which is invariant with the local contrast of the image;
deriving, in respect of each image, alignment measures in respect of each pixel, being measures of the degree of alignment between the normal to said feature and the analysis beam;
determining, in respect of each pixel of the plurality of images, relative weights in respect of the plurality of images based on the feature measures in respect of the pixel in the plurality of images, taking into account the alignment measures in respect of the pixel in the plurality of images; and
producing a combined image by combining the corresponding pixels of each image in accordance with the determined relative weights.
This method provides a combined image in which the information content from the plurality of images is maximised. Hence the combined image is of better quality for the purpose of subsequent visualisation and image analysis, for example by a clinician. This is achieved by the use of the feature measures and the alignment measures to derive relative weights in accordance with which the images are combined. As the relative weights are determined in respect of each pixel, different images having a better definition of features may predominate in different areas of the combined image so that the overall information content is improved.
The use of a feature measure which detects a feature which is invariant with the local contrast of the image allows a proper comparison between different images which may have different contrast either globally due to some parameter of the acquisition or locally due to effects such as attenuation of the analysis beam. A particularly suitable type of feature is a phase congruency measure which provides the advantages of a phase-based analysis over an intensity-based analysis.
Use of the alignment measures allows account to be taken of the dependence of image quality on the alignment between the normal to the features and the analysis beam. This dependence arises in ultrasound echo imaging because ultrasound echoes get weaker as the alignment decreases and vice versa, but a similar dependence is seen in other imaging techniques also. Thus the present method allows the combination of images taken with different views in a manner such that the high quality information acquired from each view may be retained in the combined image.
Validation of the present method on both phantom ultrasound echo images and actual RT3DUS cardiac images has shown significant improvement over the known compounding techniques mentioned above. This validation is discussed further below.
The combined image may be used in a variety of manners including general tasks such as display, segmentation, tracking etc. However, the improved quality of the images facilitates their particular application to more complicated tasks such as object recognition and in image-guided intervention, for example to align images acquired during the intervention using the combined image as a reference.
To allow better understanding, an embodiment of the present invention will now be described by way of non-limitative example with reference to the accompanying drawings, in which:
FIG. 1 is a flow diagram of a method of combining images according to the embodiment;
FIGS. 2(a) to 2(e) are simulated images showing the application of the method of FIG. 1; and
FIGS. 3(a) to 3(e) are RT3DUS images showing the application of the method of FIG. 1.
The embodiment of the present invention shown in FIG. 1 is a method of combining a plurality of images 2 of a common object, which is performed as follows.
In step 1, a plurality of images 2 of a common object are acquired using the imaging technique in question. Typically the object is a part of a human body such as the heart. The acquisition may be performed using known techniques with appropriate apparatus.
The images 2 may consist of a set of images acquired in close succession, for example a set of views acquired using the same imaging apparatus to constitute a complete diagnostic study. Alternatively, the images 2 may include images acquired at different times.
The images 2 include different views, that is where the image is acquired using an analysis beam with a different alignment relative to the object being imaged. Where plural images 2 are acquired with the same view, each image may be used separately or they may be compounded using known techniques such as averaging and the resultant compound image used as one of the plurality of images 2.
Each image is represented by a set of values for respective pixels, typically being intensity values. In general, the images 2 may be 2D images or 3D images representing points in two or three spatial dimensions, or the images 2 may additionally represent points in time as an additional dimension. The method is particularly applicable to RT3DUS images. In the case of 3D images, the individual points are often referred to as voxels, but herein the term “pixel” is being used to refer to points in images of any dimensionality.
The images 2 acquired in step 1 are processed in steps 3 to 8, which are conveniently performed by a computer program executed on a computer system 10, illustrated schematically by a dotted line in FIG. 1. The computer system 10 may be any conventional computer system, for example a general-purpose personal computer.
As an alternative, the computer system 10 could be a system dedicated to the present analysis, for example associated with a system used to acquire the images 2 in step 1. In this case the computer system 10 could be optimised for performing the analysis, for example by running various processes in parallel.
In step 3, the plurality of images 2 are registered with each other. Step 3 is optional in that the images 2 may already be registered with each other by a physical technique employed in the acquisition of the images 2 in step 1. If this is not the case, then in step 3 registration is achieved by processing of the images 2 themselves. This may be done by any of the many known registration techniques, for example as disclosed in, inter alia, Rohling, Gee & Berman, "3-D spatial compounding of ultrasound images", Medical Image Analysis, Oxford University Press, Oxford, UK, 1(3), pp. 177-193, 1997, or Xiao, Brady, Noble, Burcher & English, "Non-rigid registration of 3D free-hand ultrasound images of the breast", IEEE Transactions on Medical Imaging, 21(4), pp. 404-412, 2002.
In step 4, measures of phase and orientation are calculated in respect of each pixel. This is performed using a phase-based analysis of the images. In particular, each image is transformed into a monogenic signal image in the manner described below.
The following general points may be helpful. Phase-based analysis has been proposed as an alternative to intensity-based analysis for many image processing tasks, for example as discussed in Morrone & Owens, "Feature detection from local energy", Pattern Recognition Letters, 6:303-313, 1987. Phase provides invariance to changes in brightness and contrast within the image. This contrast-invariance property makes it particularly suitable for ultrasound images, in which beam attenuation is present and echo intensity depends on the angle of incidence of the ultrasound beam.
Local phase is usually calculated by combining the outputs of a set of filters at different orientations, but in the present method phase is derived from a monogenic signal of the type disclosed in Felsberg & Sommer, "The monogenic signal", IEEE Transactions on Signal Processing, 49(12):3136-3144, December 2001. The monogenic signal is an isotropic extension of the analytic signal which preserves its basic properties.
Analogous to the analytic signal in 1D, the monogenic signal is built by combining the original signal with its Riesz transform, a generalization of the Hilbert transform to higher dimensional signals. The Riesz transform of an N-dimensional image is obtained, in the frequency domain, by multiplying the original image with the set of filters Hi:

Hi(u)=j.ui/|u|, i=1 . . . N (1)

where j is the imaginary unit and u=[u1 . . . uN]T is the frequency coordinate vector, with ui its ith component; there are thus as many filters as image dimensions. By way of example in the 2D case, the set of filters H1 and H2 are:

H1(u)=j.u1/√(u1²+u2²), H2(u)=j.u2/√(u1²+u2²) (2)
The monogenic signal is then formed by the original image together with its Riesz transform, so the monogenic signal of an N-dimensional image is formed by N+1 N-dimensional signals, i.e. an (N+1)-dimensional vector is assigned to each original point. The length of this vector is the local amplitude of the monogenic signal, and its orientation angles correspond to the local phase and the local structure orientation. By way of example in the 2D case, these values are:
ƒ(x,y)=A(ƒ(x,y))cos(φ)
(h1*ƒ)(x,y)=A(ƒ(x,y))sin(φ)cos(θ)
(h2*ƒ)(x,y)=A(ƒ(x,y))sin(φ)sin(θ) (3)
where hi(x) is the inverse Fourier transform of Hi(u), φ and θ are the phase and local orientation angle, respectively, and A(ƒ(x,y)) is the amplitude of the monogenic signal given by the equation:

A(ƒ(x,y))=√(ƒ(x,y)²+(h1*ƒ)(x,y)²+(h2*ƒ)(x,y)²) (4)
In general, to obtain values localized in the frequency domain, the filters Hi(u) are multiplied by a set of band-pass filters G(u).
Whilst the explanation given above assumes that the original image as a whole is transformed, in step 4 each image is in fact transformed into a monogenic signal in each of a plurality of spatial frequency bands. This produces a multi-scale representation, which is useful because it localizes features both in space and in frequency. The monogenic signal in each frequency band will be denoted by a subscript s. To achieve this, respective spatial band-pass filters Gs(u) are combined with the Hi(u) filters of equation (1); that is, in the above equations Hi(u) is replaced by the combined filter Hi(u).Gs(u) in respect of each frequency band or scale s. This has the same effect as first filtering the image signals into each of the spatial frequency bands with a bank of band-pass spatial frequency filters, and then transforming each spatial frequency band of each image into a monogenic signal image.
In general the spatial band-pass filters Gs(u) may be of any form, for example one of the alternatives disclosed in Boukerroui, Noble and Brady, "On the Choice of Band-Pass Quadrature Filters", Journal of Mathematical Imaging and Vision, 21(1):53-80, July 2004. It is presently preferred to use a Difference of Gaussians (DoG) filter because this allows easy recombination of the spatial frequency bands. Similarly, the number S of spatial frequency bands and their bandwidths may be freely chosen to suit the nature of the image. In the tests of the method reported below, the number S of band-pass filters was 5 and the bandwidths were equal.
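By way of non-limitative illustration, the following sketch (in Python with numpy; the function name, the choice of scales and the DoG parametrisation are hypothetical, not taken from the embodiment) shows one possible implementation of step 4, assuming Riesz kernels of the standard form of equations (1) and (2):

```python
import numpy as np

def monogenic_multiscale(image, sigmas=(1, 2, 4, 8, 16, 32)):
    """Per-scale monogenic signal of a 2D image: returns one tuple
    (f_s, h1_s, h2_s, A_s, phi_s, theta_s) per spatial frequency band."""
    rows, cols = image.shape
    u1 = np.fft.fftfreq(rows)[:, None]       # frequency coordinate, axis 0
    u2 = np.fft.fftfreq(cols)[None, :]       # frequency coordinate, axis 1
    mag = np.hypot(u1, u2)
    safe = np.where(mag == 0.0, 1.0, mag)    # avoid division by zero at DC
    H1 = 1j * u1 / safe                      # Riesz kernels, equation (2)
    H2 = 1j * u2 / safe
    F = np.fft.fft2(image)
    bands = []
    for s in range(len(sigmas) - 1):
        # Difference of Gaussians band-pass filter G_s(u) for scale s
        G = (np.exp(-2 * np.pi**2 * sigmas[s]**2 * mag**2)
             - np.exp(-2 * np.pi**2 * sigmas[s + 1]**2 * mag**2))
        f_s = np.real(np.fft.ifft2(F * G))             # band-passed image
        h1_s = np.real(np.fft.ifft2(F * G * H1))       # Riesz component 1
        h2_s = np.real(np.fft.ifft2(F * G * H2))       # Riesz component 2
        A_s = np.sqrt(f_s**2 + h1_s**2 + h2_s**2)      # amplitude, equation (4)
        phi_s = np.arctan2(np.hypot(h1_s, h2_s), f_s)  # local phase, equation (3)
        theta_s = np.arctan2(h2_s, h1_s)               # local orientation
        bands.append((f_s, h1_s, h2_s, A_s, phi_s, theta_s))
    return bands
```

With the six example values of sigmas, the loop produces S=5 bands, matching the tests reported below.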
In step 5, a phase congruency feature is detected from each of the images 2. Each spatial frequency band corresponds to a different scale in the original images 2. The evolution of phase across the different scales can be used as a cue to differentiate image features from noise. One possibility that has been proposed for this purpose is phase congruency. This parameter quantifies phase change over different scales; a high value corresponds to a consistent phase value and is thus an indicator of an image feature.
Thus in step 5, in respect of each image there are derived feature measures P in respect of each pixel which feature measures are measures of a phase congruency feature.
In general, phase congruency may be defined in accordance with the following equation:

PC(x)=maxφ̄ [Σn An(x).cos(φn(x)−φ̄(x))]/[Σn An(x)] (5)

where n denotes the scale, fn is the n-th Fourier component of the original image signal, An and φn are the amplitude and phase of fn, and the maximum is taken over the mean phase angle φ̄(x). However, in accordance with the alternative way to calculate phase congruency disclosed in Morrone & Owens, "Feature detection from local energy", Pattern Recognition Letters, 6:303-313, 1987, the feature measures P for each image may be calculated from the amplitudes of the monogenic signal As in each spatial frequency band in accordance with the following equation:

P(x,y)=[Σs As(x,y).cos(φs(x,y)−φ̄(x,y))]/[Σs As(x,y)] (6)

where As are the signal amplitudes of the monogenic signal at the different scales s, φs are the phase values at different scales s, and φ̄ is the weighted mean phase over the scales.
More specifically, the feature measures P for each image may be calculated from the amplitudes of the monogenic signal As in each spatial frequency band in accordance with the following equation:

P(x,y)=└E(x,y)−T┘/(Σs As(x,y)+ε) (7)

where E(x,y) is the local energy, the symbol └.┘ denotes a "soft threshold" (i.e. the result equals the enclosed quantity if it is bigger than zero, and is zero otherwise), T is a threshold value used to minimize the effect of noise, and ε is a small constant added to avoid division by zero. By way of example in the 2D case, E(x,y) is calculated as:

E(x,y)=√(F(x,y)²+R1(x,y)²+R2(x,y)²) (8)

where F(x,y)=Σs ƒs(x,y), R1(x,y)=Σs (h1*ƒs)(x,y) and R2(x,y)=Σs (h2*ƒs)(x,y) are the components of the monogenic signal summed over the scales s.
More details of the derivation of a phase congruency feature which may be applied to the present invention are given in Kovesi, "Phase congruency: A low-level image invariant", Psychological Research, Springer-Verlag, Vol. 64, No. 2, 2000, pp. 136-148.
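Again by way of non-limitative illustration, the following sketch computes the feature measures P of equation (7) from the per-scale monogenic components produced by the sketch above; the values of the threshold T and the constant ε (eps) are arbitrary examples:

```python
import numpy as np

def phase_congruency(bands, T=0.1, eps=1e-6):
    """Feature measure P of equation (7); bands is the output of
    monogenic_multiscale, i.e. tuples (f_s, h1_s, h2_s, A_s, ...)."""
    F  = sum(b[0] for b in bands)        # even responses summed over scales
    R1 = sum(b[1] for b in bands)        # odd (Riesz) responses, component 1
    R2 = sum(b[2] for b in bands)        # odd (Riesz) responses, component 2
    E = np.sqrt(F**2 + R1**2 + R2**2)    # local energy, equation (8)
    A_sum = sum(b[3] for b in bands)     # sum of the amplitudes A_s
    return np.maximum(E - T, 0.0) / (A_sum + eps)   # soft threshold of (7)
```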
In step 6, the degree of alignment between the normal to the phase congruency feature and the analysis beam is determined. In particular, there are derived in respect of each image alignment measures in respect of each pixel and in respect of each spatial frequency band or scale denoted by a subscript s. The alignment measures are derived from the orientation θs of the monogenic signal in each of the spatial frequency bands or scales s, as derived in step 4. In particular, the alignment measures Ms in respect of each spatial frequency band or scale s are calculated as Ms=cos(θs−ψ), where ψ is the angle of the analysis beam. θs and ψ are both defined relative to the same axis, which is fixed relative to the object being imaged but is otherwise arbitrary. In this way, Ms quantifies how well aligned the ultrasound beam and the normal to the phase congruency feature are at each pixel. It is also important to note that, in the present multi-scale approach, a pixel can have different orientations when studied at different scales. In this way, the alignment measures are considered at the scale at which the particular structure is defined. Of course the alignment measure Ms is derived in respect of each image, so may more completely be denoted by Mis, where i indexes the images 2.
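A minimal sketch of the alignment measure follows, assuming the beam angle ψ (psi) is supplied either as a constant or as a per-pixel array; taking the absolute value of the cosine is an assumption made here, so that Ms lies in [0,1] as required by the probability rules of step 7 below, and so that a normal pointing either way along the beam scores the same:

```python
import numpy as np

def alignment_measure(theta_s, psi):
    """M_s from the per-scale orientation theta_s and the beam angle psi."""
    return np.abs(np.cos(theta_s - psi))
```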
In step 7, relative weights λis are determined in respect of each image denoted by the subscript i and in respect of each spatial frequency band or scale s. Before describing the derivation of the relative weights λis in detail, an explanation of the basis of the derivation will be given.
When acquiring images 2 from different angles, important image structures can be much better defined in certain views, and in this case averaging reduces structure definition. The aim of the combination method is to maximize the information content of the combined image. This is done by discriminating between areas that contain well-defined features and areas which merely contain speckle, on the basis of the feature measures Pi of each image 2. The feature measures Pi are a suitable measure to discriminate speckle from anatomical structures because, as described above, well-defined features exhibit little change of phase across scale space, while in the case of speckle the phase can vary significantly. Accordingly, the relative weights λis are determined in correspondence with the feature measures Pi of each image 2.
Furthermore, account is taken of the alignment measures Mis. It is well known that the backscattered energy, and thus the ultrasound image appearance, depends on the alignment between the analysis beam and the normal to the feature at the incidence point. Averaging the intensities acquired from two different views is not an optimal solution, as it would degrade the strong echoes generated at small incidence angles by introducing weaker echoes from more oblique incidences. Accordingly, the relative weights λis are determined taking into account the alignment measures Mis, to positively weight images 2 in correspondence with the alignment measures Mis.
The relative weights λis are determined in step 7 on the basis of the principles set out above, in the following manner. The following rules are applied separately to each pixel, to identify the characteristics of the pixel and thus to select the best strategy for combining the images 2 at that pixel:
If the feature measure Pi of one image 2 is relatively high but the feature measures Pi of the other images 2 are relatively low, the image 2 having a high feature measure Pi should predominate. In this case, the relative weights λis are determined in correspondence with the feature measures Pi for the plurality of images.
If the feature measures Pi of a group of plural images 2 are relatively high, the images 2 having a high feature measure Pi should predominate over the images having a low feature measure Pi if there are any. In this case, the relative weights λis are determined in correspondence with the feature measures Pi for the plurality of images 2 biased by the alignment measures Mis for the group of plural images 2 (so that those images 2 in which the feature at the pixel is better aligned will contribute a higher amount to the combination).
If the feature measures Pi of all the images 2 are relatively low, the pixel is treated as speckle, so the average value should be taken. In this case, the relative weights λis are made equal. Optionally, speckle may be reduced by multiplying the average value by a factor α which can be selected to have any value in the range 0≤α≤1.
Thus, the feature measures Pi are treated as the primary condition, and the alignment measures Mis are considered only where the feature measures are not sufficient to determine how a given pixel should be combined.
Whilst the conditions described above could be applied to select one of a set of discrete values for the relative weights λis, in step 7 the relative weights λis are actually calculated as a function of the feature measures Pi and the alignment measures Mis to provide a continuum of values for the relative weights λis. In particular, the feature measures Pi are treated as probabilities that the feature is present at the pixel and the alignment measures Mis are treated as probabilities that there is alignment between the normal to the feature and the analysis beam. On this basis, the feature measures Pi and the alignment measures Mis are combined using probability rules.
As an example, and for simplicity omitting the index s from the equation, the relative weight λ1 for the first image 2 (i=1) in the case that there are three images 2 to be combined is calculated in accordance with the equation:

λ1=[P1.P̄2.P̄3]+[P1.P2.P̄3.(M1.M̄2+(M1.M2+M̄1.M̄2)/2)+P1.P̄2.P3.(M1.M̄3+(M1.M3+M̄1.M̄3)/2)]+[P1.P2.P3.(M1.M̄2.M̄3+(M1.M2.M̄3+M1.M̄2.M3)/2+(M1.M2.M3+M̄1.M̄2.M̄3)/3)]+[(α/3).P̄1.P̄2.P̄3] (9)
As is conventional in the field of probability, the bars over P and M represent (1−P) and (1−M), respectively. The relative weights λ2 and λ3 for the other images 2 are given by cycling the indices for the three images 2. Similarly, the equation may be generalised for any number of images 2.
Although this equation is complicated, the four main parts in the four sets of square brackets can be understood as follows.
The first part represents the probability of the first image 2 being the only one containing significant structural information.
The second part represents the probability of having two images, one of them being the first image 2, contributing a non-speckle value and thus being weighted by their alignment values.
The third part represents the probability of having structural information in all three images 2.
Finally, the last part represents the probability of there being no significant structural information, i.e. pure speckle, at the pixel in question, that is the probability that no image 2 contains structural information, so that the feature measures of all the images are low. The same term is present in the equivalent equation for the relative weight λi of each image 2, and so provides relative weights λi which are equal.
The coefficient α can be used for noise reduction and can be selected to have any value in the range 0≤α≤1. In particular, α=1 corresponds to an averaging of all images 2, that is the same effect produced by average compounding but in this case applied only to regions with no feature information. When α=1, the relative weights λi of all the images 2 sum to one. However, when α<1 the relative weights λi of all the images 2 sum to a value which is generally less than one, thereby reducing the amount of speckle in the combined image, and the extreme case α=0 produces a total elimination of detected speckle. Thus the selection of the constant α allows control of the amount of speckle reduction depending on the application. For visual diagnosis, it can be dangerous to remove information from the original image, as diagnostically important information could be contained in it, so a high value of α is used. For automatic segmentation algorithms, a drastic reduction of the speckle content can be advisable. Another possibility would be to keep the speckle removed from one image and display it separately, as significant information about aspects such as motion could be obtained from it.
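The following sketch implements equation (9) for three images; the function name is hypothetical, and since the exact published form of the equation is not reproduced in this text, the sketch should be read as one consistent interpretation of the four parts explained above rather than as a definitive implementation. The weights for the other two images follow by cycling the indices:

```python
import numpy as np

def relative_weight_image1(P1, P2, P3, M1, M2, M3, alpha=1.0):
    """Relative weight lambda_1 of the first of three images, per the four
    bracketed parts of equation (9) as given above."""
    nP1, nP2, nP3 = 1 - P1, 1 - P2, 1 - P3    # barred feature measures
    nM1, nM2, nM3 = 1 - M1, 1 - M2, 1 - M3    # barred alignment measures
    # Part 1: image 1 is the only image containing structural information.
    w = P1 * nP2 * nP3
    # Part 2: a feature in image 1 and exactly one other image, shared
    # according to the alignment probabilities.
    w += P1 * P2 * nP3 * (M1 * nM2 + 0.5 * (M1 * M2 + nM1 * nM2))
    w += P1 * nP2 * P3 * (M1 * nM3 + 0.5 * (M1 * M3 + nM1 * nM3))
    # Part 3: structural information in all three images.
    w += P1 * P2 * P3 * (M1 * nM2 * nM3
                         + 0.5 * (M1 * M2 * nM3 + M1 * nM2 * M3)
                         + (M1 * M2 * M3 + nM1 * nM2 * nM3) / 3.0)
    # Part 4: pure speckle; equal weights scaled by the coefficient alpha.
    w += (alpha / 3.0) * nP1 * nP2 * nP3
    return w
```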
Finally, in step 8 a combined image 9 is produced from the images 2 in accordance with the relative weights determined in step 7. In particular, each pixel of each image 2 is weighted by its respective relative weight λis and the weighted images are summed. This sum is performed in each spatial frequency band or scale s. Thus step 8 uses the spatial frequency bands of each image 2 derived using the same spatial frequency band-pass filters as used in step 4. Accordingly, in each spatial frequency band or scale s, the value Fs of each pixel is calculated as:
Fs(x,y)=Σi λis(x,y).ƒis(x,y) (10)
and the value Fc of each pixel in the combined image 9 is calculated as:
Fc(x,y)=Σs Fs(x,y) (11)
Optionally, equation (10) may be modified to incorporate a regularisation term, that is a smoothing term that reduces the spatial variation and thus reduces noise. In particular, the value Fs of each pixel in the spatial frequency band or scale s is calculated as the weighted linear combination set out in equation (10) plus the regularisation term. Thus the combined image is still produced by combining the pixels of each image 2 in accordance with the relative weights λis, but there is an additional term in the combination equation (10). This allows for noisy measurements and/or error in the combination, by weighting the data term against the regularisation term. With this option, the downside from a theoretical perspective is that the model is no longer purely probabilistic, but the upside is that it may work better in practice when noise is present.
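By way of illustration, the combination of equations (10) and (11), omitting the optional regularisation term, reduces to two summations (hypothetical names; the arrays would be assembled from the outputs of the earlier sketches):

```python
import numpy as np

def combine_images(f, lam):
    """f and lam have shape (n_images, n_scales, rows, cols), holding the
    band-passed images f_is and the relative weights lambda_is."""
    F_s = np.sum(lam * f, axis=0)   # equation (10): weighted sum over images i
    return np.sum(F_s, axis=0)      # equation (11): sum over scales s
```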
While the method described above is presently preferred, many variations may be made within the scope of the present invention, for example as follows.
A particular advantage of the presented technique is that the framework is independent of the actual selection of the feature measures and the alignment measures used to quantify structural information and orientation. For other applications, it would be possible to introduce alternatives, while keeping the main ideas set out above. Use of the monogenic signal image constitutes a convenient framework because it provides both structural and geometric information, but is not essential. Other types of phase-based analysis could be performed. More generally, there may be detected any feature which is invariant with local contrast, for example by performing some form of local normalisation of the images 2. Similarly, other alignment measures are possible.
Furthermore, the way in which the relative weights are determined from the feature measures and the alignment measures may be varied considerably. A simple variation would be to use equation (9) with only the terms using the alignment measures M of each image 2. More complicated variations would be to derive the relative weights from the feature measures and the alignment measures using an entirely different mathematical approach.
Another possible variation is not to use the multi-scale approach of processing the images 2 in each spatial frequency band or scale s. In this case a single alignment measure M would be obtained for each image 2 representative of the alignment over all scales or a range of scales. This might be appropriate to study features of interest known to have a particular scale, for example blood vessels.
The present method has been tested on some actual images 2 as will now be described.
A first test used synthetic phantom images of a ring-shaped object. The results are shown in FIGS. 2(a) to 2(e).
As can be seen from visual examination of FIGS. 2(a) to 2(e), the combined image produced by the present method shows better definition of the object contours than simple averaging of the views.
An important indicator of the quality of the present method is its effect on the magnitude and direction of the intensity gradient at the object contours, this being a crucial parameter for edge-based segmentation methods. The magnitude of the intensity gradient in the direction normal to the ring contour has been calculated, and shows increases of more than 30% where the differences in alignment between the views are high.
A second test used RT3DUS images of the heart. The results are shown in FIGS. 3(a) to 3(e).
In summary, these results on simulated and real ultrasound images show a significant improvement of the present method over known averaging techniques, both in visual quality and quantitative measurements.