1. Field of the Invention
This invention relates to methods and apparatus for analysis, classification and/or representation of images, and to the processing of signals representing images.
Certain visual characteristics of regions in images, relating to the regularity, coarseness or smoothness of the intensity/colour patterns, are commonly referred to as texture properties. Texture properties are important to human perception and recognition of objects. They are also applicable to various tasks in machine vision, for example automated visual inspection or remote sensing, such as analysing satellite or aerial images obtained from a variety of spectral image sensors.
Texture analysis usually involves extraction of characteristic texture features from images or regions; such features can be later used for image matching, region classification, etc.
2. Description of the Prior Art
European patent application EP-A-1306805 describes a method of classifying an image, in particular the texture properties of an image. The method involves deriving a feature vector representing the texture by mapping a two dimensional representation of an image into a one dimensional representation using a pre-determined mapping function, and then determining the level-crossing statistics of the representation; in particular (i) the rate at which the representation crosses a reference level or multiple reference levels (thresholds), (ii) the rate at which the representation changes when a reference level is crossed and (iii) the average duration for which the representation remains above (or below) a single reference level or multiple reference levels (thresholds).
A novel statistical texture descriptor employing level-crossing statistics, as described in EP-A-1306805, is presented in the paper “Texture Analysis using Level Crossing Statistics”, by C. Santamaria, M. Bober and W. Szajnowski, published in Proceedings of the 17th International Conference on Pattern Recognition (ICPR-2004), Cambridge UK, 23-26 Aug. 2004, referred to herein as “ICPR-04”. The extraction of the descriptor (i.e. the statistical analysis of the level-crossing features) is performed within a moving window, using four reference levels determined from the minimum, maximum and mean amplitude values of the signal within the analysis window; these four levels are 25%, 50% and 75% of the maximum value, together with the mean of the signal. In order to describe image properties exhibited at various spatial scales, three different segment lengths are used: 21, 201, and 1001 pixels. While the presented algorithms outperform those of the prior art in both texture-based image retrieval and texture-based image segmentation tasks, it has been found that the performance and properties can be improved even further.
Aspects of the present invention are set out in the accompanying claims.
According to a further, independent aspect of the invention, a one-dimensional representation of an image is decomposed into component signals which represent different parts of the frequency bands occupied by the representation.
Preferably, the decomposition is achieved by filtering the representation with the use of a low-pass filter, preferably with a substantially constant and known (preferably zero) group delay, and subtracting the filtered signal (i.e. the signal at the output of the filter) from the original signal. This procedure is repeated several times to obtain several band-limited component signals. Accordingly, in the preferred embodiment, a plurality of low-pass filters with different filter characteristics are used to generate a plurality of different versions of the representation extending to different upper frequencies, and each version is subtracted from the version which has the next-higher upper frequency to provide a respective component function.
In a preferred embodiment of the invention, a filter with exactly zero group delay is obtained by applying a recursive filtering backwards and forwards, enabling the use of simple kernels.
In image analysis, it is often required to describe image properties at various spatial resolutions. The ICPR-04 implementation achieves that by using three signal analysis windows of different lengths: 21, 201, and 1001 pixels. The window lengths were determined experimentally (i.e. optimised for the texture analysis applications), which is a very time-consuming process and cannot guarantee optimum performance for all different applications and image data. It is known that a band-pass signal can be adequately represented by a set of its zero crossings under certain conditions (for example, if the ratio of the upper cut-off frequency to the lower cut-off frequency is between two and three, depending on the characteristics of the signal of interest). The aspect of the present invention mentioned above enables multi-resolution representation of an image without the need to use multiple analysis windows. This is achieved by applying the novel signal decomposition into bandwidth-limited component signals and applying the level-crossing statistics analysis to each component signal.
In a preferred embodiment, the number of component band-limited signals (or channels) used for image analysis can be determined adaptively by determination of the signal energy in the lowest frequency band. As a result the number of the multi-resolution “scales” and their spatial extent is defined adaptively, depending on the properties of the image.
In the approach presented in EP-A-1306805 and the ICPR-04 paper, the crossing levels used for texture analysis are selected in an ad-hoc manner. The image intensity or colour components form a non-negative function with asymmetric statistical distribution of amplitude and with often non-stationary behaviour and therefore crossings of any single fixed level or multiple fixed levels do not provide an adequate image signal characterisation. This problem can be overcome, or at least mitigated, by using the aspect of the invention mentioned above. In particular, in a preferred embodiment, a low-pass filter which is used to derive a bandwidth-limited component signal is also used to define an adaptive reference level for another component signal.
A further, independent aspect of the present invention relates to the mapping function which is used to map the image to a one-dimensional function. According to this further aspect, the mapping is achieved using a new type of closed-contour scanning pattern with local region-filling properties.
A primary image to be processed is described in a Cartesian co-ordinate system by a non-negative function f(x,y)≧0, where x and y are two spatial variables. For convenience, it may be assumed that both the spatial variables x and y are bounded as follows: 0<x<xmax and 0<y<ymax. For the purpose of digital processing, the primary image f(x,y) is represented by its samples f(xi,yj), determined at prescribed discrete locations {xi,yj: i=1, 2, . . . , I; j=1, 2, . . . , J}. Usually, those locations form a distinct and regular pattern of points, such as square or hexagonal raster.
Preferably, according to the present aspect of the invention, the resulting image f(xi,yj); i=1, 2, . . . , I; j=1, 2, . . . , J, is mapped into a scanned image, being a function f(zk) of a single discrete spatial variable zk, k=1, 2, . . . , K, by employing a scanning curve with the following properties:
the scanning curve zk, k=1, 2, . . . , K, passes through each point (xi,yj); i=1, . . . , I; j=1, . . . , J, only once; hence, K=I·J;
the scanning curve zk, k=1, 2, . . . , K, is closed; therefore, zK+1=z1, zK+2=z2, . . . , and repeated scans will result in producing an exact periodic extension of the scanned image function f(zk); and
preferably, the scanning curve zk, k=1, 2, . . . , K, will exhibit, at least some, properties of plane-filling curves, such as Hilbert or Peano curves.
The scanning curve can be constructed by suitably combining a primary plane-filling curve with a number of its replicas obtained by translation, rotation and/or mirror-imaging of the primary curve.
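One way to illustrate such a construction is the Moore curve, a closed variant of the Hilbert curve assembled from four rotated Hilbert replicas. The sketch below is an illustration only and is not taken from the specification: the function names `moore_curve` and `scan_image` are hypothetical, the image is assumed to be square with a power-of-two side, and the curve is generated from the commonly published L-system for the Moore curve.

```python
def moore_curve(order):
    """Return the (x, y) grid points of a closed Moore curve that visits
    every point of a 2**order by 2**order raster exactly once (order >= 1)."""
    # L-system: 'F' = step forward, '-' = turn left, '+' = turn right.
    rules = {"L": "-RF+LFL+FR-", "R": "+LF-RFR-FL+"}
    path = "LFL+F+LFL"
    for _ in range(order - 1):
        path = "".join(rules.get(symbol, symbol) for symbol in path)

    x, y, dx, dy = 0, 0, 0, 1          # start at origin, heading "up"
    points = [(x, y)]
    for symbol in path:
        if symbol == "F":
            x, y = x + dx, y + dy
            points.append((x, y))
        elif symbol == "-":            # 90 degree left turn
            dx, dy = -dy, dx
        elif symbol == "+":            # 90 degree right turn
            dx, dy = dy, -dx

    # Shift so that all coordinates are non-negative raster indices.
    x0 = min(p[0] for p in points)
    y0 = min(p[1] for p in points)
    return [(px - x0, py - y0) for px, py in points]


def scan_image(image):
    """Map a square image (list of rows) to the 1-D sequence f_k.
    Assumption: the side length is a power of two."""
    size = len(image)
    order = size.bit_length() - 1
    return [image[y][x] for x, y in moore_curve(order)]
```

Because the last point of the curve is a raster neighbour of the first, repeating the scan yields a periodic extension of the sequence without a jump in spatial position.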
One of the benefits resulting from the use of a closed scanning curve is that the resultant signal can be regarded as periodic and therefore, when it is desired for the one-dimensional image representation to be filtered, e.g. by a recursive filter, the initial filter conditions can be determined either by using known methods or simply by using a suitable ‘run-in’ interval.
The aspects of the invention mentioned above are independently advantageous and usable separately. However, there are significant benefits in combining them, so that the use of a closed scanning curve enhances the operation of filters used to decompose a one-dimensional image to improve analysis thereof.
Although the invention is primarily described in the context of analysing texture represented by the grey levels of an image, the invention could additionally or alternatively be used to analyse other image characteristics, such as colour components, colour component differences and multi-spectral images, and the term “texture” as used herein should be interpreted accordingly.
Arrangements embodying the invention will now be described by way of example with reference to the accompanying drawings, in which:
An input image mapper (IIM) 100 employs a predetermined mapping function to represent grey-level values of a two-dimensional (2-D) input image received at input 210 by a one-dimensional (1-D) function produced at output 212, referred to as the target function, as will be described in more detail below.
A scale-invariant transformer (SIT) 101 uses a suitable logarithmic transformation to convert the target function at output 212 of the IIM 100 into a target-function representation, with values independent of the dynamic range of the 2-D input image. (The dynamic range of the input image may be affected by varying illumination conditions, changes in local sensitivity of an image sensor, etc.)
The output of the SIT 101 is applied to the input of a filter bank (FB) 400 which operates as described below to output in succession a plurality of component functions each representing a respective band-pass component of the target function provided by the SIT 101. It should be pointed out that all filters in the filter bank FB exhibit zero group delay.
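The effect of the logarithmic transformation can be sketched with a short worked equation. The following is an illustration under stated assumptions, not a statement from the specification: the target function is assumed strictly positive, a change in illumination or sensor gain is modelled as multiplication by a constant a > 0, and each low-pass filter is assumed to have unit gain at zero frequency. The symbols a and g_k are introduced here for illustration only.

```latex
% Gain a > 0 becomes an additive offset after the logarithm:
g_k = \log(a\, f_k) = \log a + \log f_k .
% A unit-DC-gain low-pass filter passes the constant \log a unchanged,
% so the offset cancels in the difference forming a band-pass component:
g_k - g^{(1)}_k
  = \bigl(\log a + \log f_k\bigr) - \bigl(\log a + (\log f)^{(1)}_k\bigr)
  = \log f_k - (\log f)^{(1)}_k .
```

Under these assumptions each band-pass component is unaffected by the gain a.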
The component functions are each separately delivered (as illustrated by bus 490) to a feature analysis block (FAB) 500 which, for each component function, derives a set of values characteristic of that function. The combined sets of values for the component functions form a feature vector representing the original image, which is sent to an image texture classifier (ITC) 107.
The image texture classifier (ITC) 107 processes jointly feature data available at its inputs to perform texture classification of the 2-D input image. The procedure used for texture classification may be based on partitioning of the entire feature space into a specified number of regions that represent texture classes of interest.
The system as described above is similar to that described in EP-A-1306805, except for the input image mapper (IIM) 100 and the filter bank (FB) 400, the operations of which will now be described in more detail.
The input image mapper (IIM) 100 transforms the two-dimensional representation of the image received at input 210 into a one-dimensional representation by using a closed scanning curve.
The above examples of the construction of closed scanning curves are presented for the purpose of illustration, and are not intended to be exhaustive or to limit the invention to the precise form disclosed. For example, in some applications, it will be advantageous to replace a square raster by a hexagonal one. In such cases, a closed scanning curve can be modified by applying a suitable ‘distortion’.
In a preferred embodiment, the scanning curve covers only a part of the image, and the image is scanned so that scanning patterns tessellate (tile) the image in a non-overlapping way. In another preferred embodiment for applications where “dense” image description is required, scanning with overlapping tiles is used. After the scanning, each tile produces a scanned image sequence (one dimensional signal), which is periodic. Each sequence is processed in a similar way to extract image descriptors for the corresponding image region.
The use of a closed scanning curve is beneficial for the filtering operation performed by the filter bank (FB) of
$|z_k - z_{k-1}|;\ k = 2, \ldots, N,$
between consecutive sampling points on the scanning curve is constant, the scanned image function f(zk) can be represented uniquely by a corresponding sequence fk. An example of such a sequence fk is shown in
The representation is fed to a summer 410 which provides a first component function output 420 containing only the higher frequencies of the target function representation. The target function representation is also fed to a first low-pass filter 430, to remove the higher frequencies, the output of this filter then being fed to a subtracting input of the summer 410, so that the lower frequencies are subtracted from the target function representation, and therefore only the higher frequencies remain in the first component function 420. The low-pass filter 430 has a zero group delay.
The output of the low-pass filter 430 is also fed to a second summer, 412, which provides at its output a second component function 422 containing components of lower frequency than those present in the first component function 420. A further low-pass filter 432 receives the target function representation and has a lower cut-off frequency than the filter 430 and an output which is fed to a subtracting input of the summer 412, so that the component function 422 contains band-pass frequencies between the cut-off frequencies of the low-pass filters 430 and 432.
Additional channels containing further low-pass filters of progressively decreasing cut-off frequencies can be provided to obtain additional component functions of progressively lower band-pass frequencies. Thus, for example, the summer 414 can also receive the output from low-pass filter 434 and, at its subtracting input, an output of another filter, to provide a third component signal 424.
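The subtraction structure described above can be summarised in a short worked equation. This is a sketch in notation chosen for illustration: f_k denotes the target function representation, f^{(m)}_k the output of the m-th low-pass filter, c^{(m)}_k the m-th component function, and M the number of channels.

```latex
c^{(1)}_k = f_k - f^{(1)}_k, \qquad
c^{(m)}_k = f^{(m-1)}_k - f^{(m)}_k, \quad m = 2, \ldots, M .
% The sum telescopes, so nothing is lost apart from the final low-pass residue:
\sum_{m=1}^{M} c^{(m)}_k = f_k - f^{(M)}_k .
```

The component functions together with the last low-pass output therefore reconstruct the original representation exactly, whatever the individual filter characteristics.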
In practice, the filter bank 400 is implemented using a processor which receives, stores and processes data in such a way that the component functions 420, 422, etc. are generated at successive times. At least one, and preferably each, component function is separately fed to a power measuring unit (PMU) 450 to measure the power present in each component function. The measurements are passed to a control unit (CU) 460, which controls the number of channels, and thus the number of component functions, which are generated and can also be used in signal characterisation. In this way, the control unit 460 can terminate the filter operation when the most recent component function has a power less than a predetermined threshold. Accordingly, the number of channels and component functions is controlled adaptively.
In EP-A-1306805, the target function representation is split into window regions, within each of which the function is analysed by comparing it with constant-value reference levels. However, better characterisation can be obtained by using an adaptive reference level. A level varying in an adaptive manner is obtained by the low-pass filters 430, 432, etc.
The adaptive reference level is obtained from the scanned image sequence fk by performing the two following operations in each filter:
1. applying to the sequence $f_k$ the recursive filter
$$\nu_k = \gamma_1 \nu_{k-1} + (1-\gamma_1) f_k, \quad 0 < \gamma_1 < 1, \quad k = 1, 2, \ldots, K$$
to generate an auxiliary sequence $\nu_k$; and
2. applying to the auxiliary sequence $\nu_k$ the same recursive filter running backward, i.e.
$$f^{(1)}_{K+1-k} = \gamma_1 f^{(1)}_{K+2-k} + (1-\gamma_1)\, \nu_{K+1-k}, \quad 0 < \gamma_1 < 1, \quad k = 1, 2, \ldots, K$$
to produce a sequence $f^{(1)}_k$, which represents the required adaptive reference level.
The main purpose of running the same recursive filter forward and backward is to obtain low-pass filtering with exactly zero group delay. An example of a sequence f(1)k is shown in
It should be pointed out that the two above operations can be implemented in any order. Furthermore, their implementation can be structured in such a way that they can be running concurrently.
The parameter γ1 is related to the exponential impulse response exp(−γ1τ) of a single-pole analogue low-pass filter. However, the combined operation of the recursive filter running forward and backward is equivalent to convolving a function being processed with a symmetric exponential impulse response of the form exp(−γ1|τ|). Because of the symmetry of the impulse response, the associated group delay will always be equal to zero.
In general, the use of a recursive filter requires determining its initial conditions, such as v0 and f(1)K+1. This is facilitated by the use of a closed scanning curve, permitting the filter to use values spanning the boundary between the start and end points of the curve without encountering discontinuity; i.e. the closed scanning curve effectively produces a periodic sequence to be processed. It is thus possible to determine the initial conditions exactly by applying methods known to those skilled in the art. However, a much simpler and practical approach will be to use a suitable ‘run-in’ interval before the notional start of the period of a sequence being processed.
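The forward and backward recursion with a run-in interval can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `forward_backward_lowpass` and the default run-in of one full period are assumptions, and the run-in only approximates the exact periodic initial conditions (the approximation is good when gamma raised to the run-in length is negligible).

```python
def forward_backward_lowpass(f, gamma, run_in=None):
    """Zero-group-delay low-pass of one period f of a periodic sequence.

    Runs v[k] = gamma*v[k-1] + (1-gamma)*f[k] forward, then the same
    recursion backward over v. Initial conditions are approximated by
    starting each pass `run_in` samples early on the periodic extension.
    """
    K = len(f)
    if run_in is None:
        run_in = K                     # assumption: one period of run-in

    def recurse(seq):
        state = seq[-run_in % K]       # arbitrary start value for the run-in
        out = [0.0] * K
        for n in range(-run_in, K):
            state = gamma * state + (1.0 - gamma) * seq[n % K]
            if n >= 0:                 # keep only samples after the run-in
                out[n] = state
        return out

    v = recurse(f)                     # forward pass
    # Backward pass: reverse, run the same recursion, reverse again.
    return recurse(v[::-1])[::-1]
```

For example, filtering a single impulse placed in an otherwise zero period gives an output that is symmetric about the impulse position, which is the zero-group-delay property; a constant input is returned unchanged.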
For improved filtering operation, a different recursive filter can be used as follows:
$$\nu_k = 2\gamma_2 \nu_{k-1} - \gamma_2^2\, \nu_{k-2} + (1-\gamma_2)^2 f_k, \quad 0 < \gamma_2 < 1, \quad k = 1, 2, \ldots, K$$
to generate an auxiliary sequence $\nu_k$. A sequence $f^{(1)}_k$, representing the required adaptive reference level, will then be obtained by running the filter backward through the auxiliary sequence $\nu_k$.
The combined operation of the improved recursive filter running forward and backward is equivalent to convolving a function being processed with an impulse response of the form (1+γ2|τ|)exp(−γ2|τ|). Also in this case, because of the symmetry of the impulse response, the associated group delay is equal to zero.
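The relation between the improved filter and the first-order filter can be seen from its transfer function. The short derivation below is added for illustration and uses standard z-transform notation, which does not appear in the specification.

```latex
H_2(z) = \frac{(1-\gamma_2)^2}{1 - 2\gamma_2 z^{-1} + \gamma_2^2 z^{-2}}
       = \left( \frac{1-\gamma_2}{1 - \gamma_2 z^{-1}} \right)^{2},
\qquad H_2(1) = 1 .
```

The second-order recursion is thus equivalent to applying the first-order recursion with parameter γ2 twice in the same direction, and it has unit gain at zero frequency, so a constant input passes through unchanged.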
The equivalent impulse response of the filter closely resembles a Gaussian response. The equivalent impulse response can approximate a Gaussian response even better, when the sequence being processed is subjected to successive multiple passes (forward and backward) of either the same recursive filters or different recursive filters.
All the operations described above are carried out repeatedly, using further low-pass filters to obtain further component functions; each time, the preceding adaptive reference level is treated as the original primary scanned image sequence $f_k$. However, each time the filter parameters are modified as follows: $\gamma_1^{\text{new}} = (\gamma_1^{\text{old}})^{\kappa}$ and $\gamma_2^{\text{new}} = (\gamma_2^{\text{old}})^{\kappa}$, where $1/3 \le \kappa \le 1/2$. The entire process is terminated when the last adaptive reference level assumes an almost constant value. In most practical applications, the number of replications of the entire process will be limited to three or four.
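The repeated procedure can be sketched as a loop. The code below is an illustrative sketch under several assumptions that are not in the specification: the names `lowpass` and `decompose`, the default parameter values, the use of the first-order filter, a run-in of one period, and termination on the mean power of the latest component (rather than on the flatness of the reference level) are all choices made for the example.

```python
def lowpass(f, gamma):
    """First-order recursion run forward then backward on a periodic
    sequence; the first lap of each pass serves as the run-in."""
    K = len(f)

    def one_pass(seq):
        state, out = seq[0], [0.0] * K
        for n in range(2 * K):         # second lap overwrites the run-in lap
            state = gamma * state + (1.0 - gamma) * seq[n % K]
            out[n % K] = state
        return out

    return one_pass(one_pass(f)[::-1])[::-1]


def decompose(f, gamma=0.5, kappa=0.5, power_threshold=1e-3, max_channels=4):
    """Split f into band-limited components plus a residual reference level.

    Each stage subtracts the adaptive reference level from its input, then
    treats that reference level as the input of the next stage with
    gamma replaced by gamma**kappa (a lower cut-off frequency).
    """
    components, reference = [], list(f)
    for _ in range(max_channels):
        smoothed = lowpass(reference, gamma)
        component = [a - b for a, b in zip(reference, smoothed)]
        power = sum(c * c for c in component) / len(component)
        if power < power_threshold:    # hypothetical stopping rule
            break
        components.append(component)
        reference = smoothed
        gamma = gamma ** kappa         # 1/3 <= kappa <= 1/2 in the text
    return components, reference
```

Adding the returned components to the residual reference level reproduces the input sequence, and the number of components returned varies with the signal. With a run-in of only one period, the later stages (where gamma is close to one) are only approximately in periodic steady state; a longer run-in would be needed for higher accuracy.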
The low-pass filters are preferably designed so that the ratio of the cut-off frequencies of the filters of “adjacent” channels is between two and three, because this often provides efficient coverage of the target function spectrum. The number of processor channels employed depends on the spectral properties of the signal fk, which in turn depend on the image properties. For typical images, three channels provide a good trade-off between system complexity and performance. If desired, the system could have a fixed number (e.g. three) of channels, instead of an adaptively variable number.
The component functions are delivered to the feature analysis block (FAB) 500. The analysis of the signal performed in the feature analysis block (FAB) 500 will depend on the application. For example for image texture analysis it may include extraction of the zero crossings of the component functions and computing statistical features (descriptors) from the zero crossings and/or the signal at zero crossing points. The illustrated embodiment uses a crossing rate estimator (CRE) 510, a crossing slope estimator (CSE) 520 and a sojourn time estimator (STE) 530 which operate in the same manner as the corresponding units disclosed in EP-A-1306805. In this case, though, the estimators are operating on signal sequences from which adaptive reference levels have been subtracted, so the estimators operate using a zero reference level, as detected by a zero-crossing detector (ZCD) 540. The above examples of the features extracted from the signal are presented for the purpose of illustration only, and are not intended to be exhaustive or to limit the invention.
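A simple form of the three zero-reference estimates can be sketched as follows. The exact estimators are those of EP-A-1306805 and are not reproduced in this text, so the definitions below are assumptions made for illustration: the crossing rate is taken as sign changes per sample, the crossing slope as the mean absolute sample difference across a crossing, and the sojourn time as the mean run length above zero. The function name `zero_crossing_features` is hypothetical, and the component is treated as periodic, as produced by a closed scanning curve.

```python
def zero_crossing_features(component):
    """Crossing rate, crossing slope and mean sojourn time above zero
    for one period of a periodic component function."""
    K = len(component)
    above = [value > 0 for value in component]
    # Index k marks a crossing when samples k and k+1 (cyclically) differ in sign.
    crossings = [k for k in range(K) if above[k] != above[(k + 1) % K]]

    if not crossings:
        # No crossing: the signal stays on one side for the whole period.
        return {"rate": 0.0, "slope": 0.0,
                "sojourn_above": float(K) if above[0] else 0.0}

    rate = len(crossings) / K
    slope = sum(abs(component[(k + 1) % K] - component[k])
                for k in crossings) / len(crossings)
    # On a closed sequence crossings come in up/down pairs, so the number
    # of separate intervals above zero is half the number of crossings.
    sojourn_above = sum(above) / (len(crossings) / 2)
    return {"rate": rate, "slope": slope, "sojourn_above": sojourn_above}
```

Applying the function to each component function and concatenating the results gives one possible feature vector of the kind passed to the classifier.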
Many modifications and variations exist and will enable those skilled in the art to utilize the invention in various embodiments suited to the particular task contemplated. For example, although the filtering operation described above was performed to decompose a target representation, the benefits of using a closed scanning curve would also be obtained in circumstances in which the filtering is used for a different purpose, such as noise reduction (e.g. using a non-linear median filter) or edge enhancement (e.g. using a high-pass filter). Also, although the filters mentioned above are recursive, it would be possible alternatively to use Finite Impulse Response filters; again, similar benefits will accrue from the use of closed scanning curves. Although the invention has been described in the context of analysis of two-dimensional images, the techniques can be extended to analysis of multidimensional data, and in particular multidimensional images, by employing suitable space-filling curves. The image may be a conventional visual image, or may be an image in a non-visual part of the electromagnetic spectrum, or indeed may be in a different domain, such as an ultrasound image.
| Number | Date | Country | Kind |
|---|---|---|---|
| 05251539.2 | Mar 2005 | EP | regional |

| Filing Document | Filing Date | Country | Kind | 371c Date |
|---|---|---|---|---|
| PCT/GB2006/000908 | 3/14/2006 | WO | 00 | 3/25/2008 |