1. Field of the Invention
The present invention relates to a method and apparatus for analysing a plurality of stored images.
2. Description of the Prior Art
Techniques have been derived for indexing and searching textual information items, or at least items having some textual content. An example of such a technique is to generate feature data from the textual item (e.g. word distribution) and to allow comparisons between items to be made on the basis of a comparison of the feature data.
With image items, however, few useful techniques have been proposed.
One simple technique is to associate some text with an image. This could be as simple as a title, or could involve more detailed “metadata” such as a paragraph of description, a schedule of items or people in the image, a time of capture of the image, a schedule of those involved in its capture, and so on. Text-based searching techniques can then be used to identify similar images. But providing accurate and useful metadata is time-consuming, and computationally expensive.
Other techniques establish feature data based on properties of the images themselves. These might include colour properties, texture properties and the like. But this is also limited because two images, which to a human observer represent the same thing, may have very different image properties. For example, a pair of images of a particular person might have very different image properties because the image backgrounds are different.
Also, image processing, generally, is slow. However, users wish to conduct the searching and analysing of images quickly. In order to increase the speed, the computational resource is increased. This then increases the cost and complexity of systems.
It is the aim of the present invention to address these problems.
In one aspect of the present invention there is provided a method of analysing a plurality of stored images comprising the steps of:
dividing each of the plurality of the stored images into a plurality of segments;
deriving a plurality of sets of segments, each set comprising different combinations of said segments to the other sets;
deriving feature data corresponding to a property of each set of segments; and
storing said derived feature data in association with said stored image.
The technique embodied in the present invention can reduce the time and/or computational processing required to search a plurality of images based, for example, on initial search criteria specified by a user. This is because, once the search criteria have been established, only sets of segments from each image of the plurality of images need be searched, rather than the entire image. Furthermore, in some embodiments, characteristics of the search criteria, such as the shape, size or orientation of an area to be searched, can be used to identify appropriate sets of segments from the plurality of images which may contain feature data that is particularly relevant to a given search.
In some embodiments, the method comprises generating an ordered list of stored images, the order of the stored images being determined in accordance with the result of the comparison between the derived feature data from the first image and the stored derived feature data corresponding to the stored image.
Additionally, in some embodiments, the feature data of the first image and the stored feature data are derived independently of the size and position of the defined area.
In some embodiments the feature data is representative of the colour properties of each set of segments.
In some embodiments the stored images are divided into a plurality of substantially equally sized segments.
In some embodiments the segments are quadrilaterally shaped.
In some embodiments the method of analysing further comprises:
defining an area in a first image;
deriving feature data corresponding to a property of the area defined in the first image, the property corresponding to the stored derived feature data; and
comparing said derived feature data from the first image with said stored derived feature data.
In some embodiments the defined area in the first image corresponds in shape to at least one of the stored sets of segments and the comparison is carried out on that or those stored sets having the corresponding shape.
In some embodiments the defined area in the first image is generated in response to a user input.
In some embodiments the method of analysing further comprises selecting at least one stored image in accordance with the result of the comparison.
In some embodiments the method of analysing further comprises: highlighting, to the user, the set of segments in the selected stored image whose feature data, following comparison, resulted in the selection of said selected image.
According to another aspect, there is provided a system operative to search through stored images for similar images, the system comprising:
an area definer operable to define, in an image under test, an area around a foreground object, the area around the foreground object including at least part of the background of the image under test;
a feature data generator operable to generate feature data representative of a property of the foreground object and feature data representative of the background in the area; and
a comparison device operable to compare the generated feature data with other respective feature data representative of the foreground object and the background in stored images, and, in response to the comparison of the feature data, returning relevancy data for at least some of the feature data representative of the stored images, the relevancy data indicating a degree of relevance between respective feature data for the stored images to the feature data defined from the image under test.
Various further aspects and features of the invention are defined in the appended claims.
The above and other objects, features and advantages of the invention will be apparent from the following detailed description of illustrative embodiments which is to be read in connection with the accompanying drawings, in which:
In general terms, the image processing system is arranged such that a user may search through a large number of images from an image repository in order to identify images which correspond to various search criteria specified by the user. Typically the user will specify the search criteria by taking a first image and selecting parts or features of this first image. The first image (or the selected part or parts) will, in embodiments, be subject to processing. This processing will be described below. It should be noted here that the processing carried out on the first image may also be carried out on one or more of the images in the repository through which the searching will take place. The processing on the images in the repository may take place before the search is conducted (termed “pre-analysis”) or as the search through the images is carried out (termed “on the fly”). This processing will be explained later.
The image processing system will then search the image repository with reference to the parts or features of the first image selected by the user. For example, the user may wish to identify images from the repository including birds. In this case, the user selects a first image that includes a bird and selects the specific parts or features of the image which encapsulate a bird. After the search has been conducted, a list of images from the image repository will be generated. This identifies images in the repository which are deemed to be similar or contain similar elements to the parts or features of the first image selected by the user. This provides the user with the ability to pick out only features of an image that are relevant to them for the particular search. For instance, in this example, the beak of the bird may be selected and only images having similar beaks will be returned in the search. This makes more efficient use of computer resources because only relevant sections are returned to the user. Additionally, by searching only selected parts which are processed in the manner discussed below, the returned images are scale invariant. In other words, in the example above, it will not matter whether the beak is 20% of the image or 70% of the image; both will be returned as relevant. This improves the searching mechanism. In some embodiments the system will rank the images in the generated list by identifying those images which most closely match the selected search criteria.
The image repository may comprise a plurality of images stored within the system for example on the disk storage 30. Alternatively the image repository may be stored on some form of storage media which is remote from the system and which the system gains access to via some form of intermediate link such as the network interface card connected to the network 50. The images may be distributed over a number of storage nodes connected to the network 50.
The images may be in various forms for example “still” images captured by a camera or the images may be taken from a series of images comprising a video stream.
As noted above, the first image (or, in embodiments, the selected part) is subjected to image processing.
The Image Searching Mechanism
In order to search the images in the image repository, the image processing system undertakes the following steps:
A first image from which the search criteria are to be derived is selected. The image might be selected from the image repository or be a new image loaded onto the system from an external source via the network 50 or from a disk or other storage media attached to the system.
The image is typically presented to the user on the display device 60 and the user selects an area of the image using an input device such as the mouse 80. In some embodiments the image is segmented into a grid and the user selects one or more segments of the grid which contain the features of the image upon which the user bases the search. However, the invention is not so limited and a user can define their own area using the mouse 80 as noted below.
As noted above, in some embodiments at least some of the images from the image repository will be pre-analysed. The pre-analysis of the images in the repository reduces the processing load on the system at the time of searching and thus increases the speed at which the searching through images takes place. To further increase the speed with which the search is conducted, the pre-analysis of the images in the repository is carried out using a similar technique to that used to analyse the first image. Additionally, as part of the pre-analysis of the images in the repository, at least some of the pre-analysed images may be segmented into blocks for example by the application of a grid such as a 2×2 grid, 3×3 grid, 4×4 grid. Alternatively a non-square grid could be used such as a 2×3 or 3×4 grid. Individual blocks or groups of blocks may be analysed independently of the image as a whole therefore allowing not only images from the image repository to be searched but also different parts of each image from the image repository. Furthermore, the system may be operable to search for parts of images which correspond in shape to an area selected by the user, as described above. Thus if a user selects an area as shown in
In another embodiment as noted above, the user may simply define an area which contains the features of the images upon which the search is to be based. This is indicated in
In another embodiment of the invention the images from the image repository are divided into a plurality of sets of segments. The plurality of sets of segments which are stored on the image repository are analysed to derive feature data representing an attribute of each of the sets of segments. The results of this analysis are then stored in association with the image.
The user can then select a set of segments from the first image corresponding for example to a feature of interest. The system is operable to search the sets of segments from the images of the image repository which correspond in some respect to the selected segments.
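By way of illustration only, the division of an image into a grid of blocks and the derivation of feature data for each set of blocks might be sketched as follows. The function names, the restriction to rectangular groups of adjacent blocks, and the use of a mean colour as the feature are all assumptions made for this sketch and are not taken from the embodiments.

```python
from itertools import product

def grid_blocks(width, height, rows, cols):
    """Return {(r, c): (x0, y0, x1, y1)} pixel bounds for a rows x cols grid."""
    blocks = {}
    for r, c in product(range(rows), range(cols)):
        x0, x1 = c * width // cols, (c + 1) * width // cols
        y0, y1 = r * height // rows, (r + 1) * height // rows
        blocks[(r, c)] = (x0, y0, x1, y1)
    return blocks

def rectangular_sets(rows, cols):
    """Enumerate every rectangular group of adjacent blocks (assumed set shape)."""
    sets = []
    for r0 in range(rows):
        for r1 in range(r0, rows):
            for c0 in range(cols):
                for c1 in range(c0, cols):
                    cells = product(range(r0, r1 + 1), range(c0, c1 + 1))
                    sets.append(frozenset(cells))
    return sets

def mean_colour(image, blocks, segment_set):
    """Placeholder feature: mean (R, G, B) over all pixels in the set of blocks."""
    total = [0, 0, 0]
    count = 0
    for key in segment_set:
        x0, y0, x1, y1 = blocks[key]
        for y in range(y0, y1):
            for x in range(x0, x1):
                for i in range(3):
                    total[i] += image[y][x][i]
                count += 1
    return tuple(t / count for t in total)

def analyse(image, rows=2, cols=2):
    """Return {set of blocks: feature data}, to be stored with the image."""
    height, width = len(image), len(image[0])
    blocks = grid_blocks(width, height, rows, cols)
    return {s: mean_colour(image, blocks, s) for s in rectangular_sets(rows, cols)}
```

Here `image` is assumed to be a list of rows of (R, G, B) tuples. A search for an area of a given shape could then be limited to the stored sets whose block layout matches that shape.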
After the area containing the features of interest has been selected the search through the repository of images continues. In order to perform the search, the first image (or the selected part) needs to be subjected to processing.
Image Processing
In order to commence the search the system, in embodiments, performs a colour resolution reduction procedure on the image. As will be understood, each pixel of an image is typically defined by data representing pixel colour component values such as “R”, “G” and “B” values (defining red, green and blue components respectively) or colour encoding schemes providing colour component values such as “Y”, “CB” and “CR” (defining a “luma” value and “chroma” values respectively). Such values determine the colour of each pixel. The number of possible colours that can be used to provide pixel colours is determined by the number of bits used to represent the pixel colour component values. Typically this is 16 million colours although this is only exemplary. The colour resolution reduction procedure will typically involve a “down-sampling” or decimation operation on each colour component value the result of which is to reduce the total number of possible colours for a pixel. After the colour resolution reduction procedure has been applied to the image, the number of colours in the image will be reduced. An effect that arises in many images after a colour resolution reduction procedure has been applied is that the image is segmented into areas of the same colour. This effect manifests itself as lending an image a “blocky” appearance. A simplified example of this is shown in
After the colour resolution reduction procedure has segmented the image into a number of areas of identical colour, the image is further divided into a number of colour planes in which each plane comprises only the image elements of one colour. Thus the number of colour planes will be the same as the total number of colours in the image after the colour resolution reduction procedure. The division of the image into colour planes comprising image elements of each colour is shown in
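A minimal sketch of the colour resolution reduction and the subsequent division into colour planes is given below. It assumes 8-bit colour components and implements the reduction by discarding the low-order bits of each component; the embodiments do not specify the particular down-sampling operation, so this choice, and the function names, are illustrative only.

```python
def reduce_colour(image, bits=2):
    """Keep only the top `bits` bits of each 8-bit colour component (assumed scheme)."""
    shift = 8 - bits
    return [[tuple(v >> shift for v in px) for px in row] for row in image]

def colour_planes(reduced):
    """Map each remaining colour to a binary mask marking the pixels of that colour."""
    height, width = len(reduced), len(reduced[0])
    planes = {}
    for y, row in enumerate(reduced):
        for x, colour in enumerate(row):
            # Create an all-zero mask the first time a colour is encountered.
            mask = planes.setdefault(colour, [[0] * width for _ in range(height)])
            mask[y][x] = 1
    return planes
```

With `bits=2` each component takes one of four values, so at most 64 colours, and therefore at most 64 colour planes, remain.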
Each plane is then analysed in order to derive feature data such as a feature vector corresponding to a property of the image element or elements contained therein. The property may relate to one or many aspects of the image element, for example simple size or colour, or more complex considerations such as the form of the shape of the elements. Furthermore, as will be understood, a feature vector is one example of an abstract measure of a property of the image element. Another example might be the sum of the absolute differences. In some embodiments the feature vector for one or more colour planes is generated by first detecting the edge pixels for each image element and then counting the pixels around the perimeter of each image element in the colour plane. Although detecting the edge pixels is discussed further below, known techniques such as blob analysis may be used. A mean of these perimeter values is then calculated, producing a single scalar value for each colour plane. This procedure is repeated for each colour plane. The calculated mean scalar value for each colour plane is taken and a histogram produced. A simplified histogram is shown in
The histogram is then compared to similarly generated histograms for each of the images from the image repository.
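The derivation of the histogram of mean perimeter values might be sketched as follows. The sketch assumes that each colour plane is supplied as a binary mask, that image elements are 4-connected groups of pixels, and that a perimeter pixel is a pixel of an element having at least one 4-neighbour outside that element. These definitions, and the function names, are assumptions of the sketch; the edge detection technique described later could be substituted.

```python
def elements(mask):
    """Yield each 4-connected image element in a binary mask as a set of (y, x)."""
    height, width = len(mask), len(mask[0])
    seen = set()
    for y in range(height):
        for x in range(width):
            if mask[y][x] and (y, x) not in seen:
                stack, element = [(y, x)], set()
                seen.add((y, x))
                while stack:  # iterative flood fill from the seed pixel
                    cy, cx = stack.pop()
                    element.add((cy, cx))
                    for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                        inside = 0 <= ny < height and 0 <= nx < width
                        if inside and mask[ny][nx] and (ny, nx) not in seen:
                            seen.add((ny, nx))
                            stack.append((ny, nx))
                yield element

def perimeter(element):
    """Count pixels of the element with at least one 4-neighbour outside it."""
    return sum(
        1
        for (y, x) in element
        if any(n not in element for n in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)))
    )

def perimeter_histogram(planes):
    """Return {colour: mean perimeter of the elements in that colour plane}."""
    histogram = {}
    for colour, mask in planes.items():
        counts = [perimeter(e) for e in elements(mask)]
        histogram[colour] = sum(counts) / len(counts) if counts else 0.0
    return histogram
```

Each key of the returned dictionary plays the role of one histogram bin, holding the single scalar value calculated for that colour plane.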
There are many techniques for comparing the histogram derived from the first image with those similarly derived from the repository of images. In a very simple example, corresponding bins of the two histograms can be aligned and the absolute difference between the histograms calculated. The result of this subtraction can be represented as a further histogram. The bins of the resulting histogram can be summed to produce a single value. The closer this value is to zero, the more similar the histograms. A similar image in the repository is identified when the summed value is below a threshold. Although only a simple technique is described here for comparing histograms, the skilled person will appreciate that more sophisticated techniques exist.
The result of the histogram comparison will typically generate a number of “hits” corresponding to similar images from the image repository. These similar images can then be presented to the user on the display screen. As will be understood, the number of returned images can be controlled by specifying certain parameters. For example the system may be arranged to return the first 10 images with histograms which most closely correspond to that of the first image. Alternatively the system can be arranged to return all images the histograms of which meet a certain threshold level of similarity with the histogram derived from the first image, as noted above. In order to aid the user, the set of segments in the “hit” image which correspond to the set of segments selected by the user is outlined in the “hit” image.
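The simple comparison described above, together with the two ways of limiting the number of returned images, might be sketched as follows. Histograms are assumed to be dictionaries mapping a bin (for example a colour) to a scalar value, and a bin that is absent from one histogram is assumed to hold zero; the function and parameter names are hypothetical.

```python
def histogram_distance(a, b):
    """Sum of absolute bin differences; bins missing from one histogram count as 0."""
    return sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in set(a) | set(b))

def search(query, repository, top_n=None, threshold=None):
    """Return (image_id, distance) pairs ordered from most to least similar."""
    scored = sorted(
        ((image_id, histogram_distance(query, hist)) for image_id, hist in repository.items()),
        key=lambda pair: pair[1],
    )
    if threshold is not None:
        # Keep only images whose summed difference is below the threshold.
        scored = [pair for pair in scored if pair[1] < threshold]
    return scored if top_n is None else scored[:top_n]
```

Calling `search(query, repository, top_n=10)` corresponds to returning the first 10 closest images, whereas passing `threshold` corresponds to returning every image that meets a given level of similarity.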
In some embodiments the total number of pixels on the perimeter of each image element is counted in order to provide a feature vector for each colour plane. Methods known in the art for detecting edge pixels are typically computationally intensive and require pixel by pixel analysis. This often makes real time edge detection for high resolution images quite difficult. In some embodiments of the system, in the image processing method, the following edge detection technique is used. It is understood that, in other embodiments, a different edge detection technique may be used.
Edge Detection
The technique comprises replicating eight times the image to be tested for edge pixels. Each duplication is shifted (i.e. spatially transformed) by one pixel in each of the eight possible directions (i.e. x+1, y+0; x−1, y+0; x+0, y+1; x+0, y−1; x+1, y+1; x+1, y−1; x−1, y−1; x−1, y+1). An XOR function is then taken of all of the corresponding pixels from the eight transformed replicated images. The result of this XOR function is a binary matrix with a “1” indicating an edge pixel and a “0” indicating a non-edge pixel. A simplified version of this technique is illustrated in
1 XOR 1 XOR 0 XOR 0 XOR 0 XOR 1 XOR 1 = 1
Thus the pixel 76 being tested is shown to be an edge pixel.
1 XOR 1 XOR 1 XOR 1 XOR 1 XOR 1 XOR 1 = 0
Thus the pixel 91 being tested is shown to be a non-edge pixel.
As described above, once the XOR function has been carried out for every pixel in the area 71, a binary matrix with a “1” indicating an edge pixel and a “0” indicating a non-edge pixel is produced. This is shown in
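A minimal sketch of the shift-and-combine technique on a binary mask is given below. It rests on several assumptions that are not taken from the embodiments: the XOR combination is interpreted as a test of whether the eight shifted copies disagree at a pixel (yielding "1") or all agree (yielding "0"), rather than as a bitwise parity; positions vacated by a shift are filled with "0"; and, by default, only pixels belonging to the image element are reported. The function names are hypothetical.

```python
# The eight one-pixel displacements (dx, dy) listed in the description.
SHIFTS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (1, -1), (-1, -1), (-1, 1)]

def shifted(mask, dx, dy):
    """Return a copy of the binary mask displaced by (dx, dy); vacated cells are 0."""
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            sy, sx = y - dy, x - dx
            if 0 <= sy < height and 0 <= sx < width:
                out[y][x] = mask[sy][sx]
    return out

def edge_matrix(mask, element_only=True):
    """Return a binary matrix with 1 for edge pixels and 0 for non-edge pixels."""
    copies = [shifted(mask, dx, dy) for dx, dy in SHIFTS]
    height, width = len(mask), len(mask[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            values = [copy[y][x] for copy in copies]
            # Assumed reading of the combination: 1 when the copies disagree.
            disagree = min(values) != max(values)
            if disagree and (mask[y][x] or not element_only):
                edges[y][x] = 1
    return edges
```

With `element_only=False` the pixels immediately outside an image element are also marked, since the shifted copies disagree there as well; the default restricts the result to the element's own boundary.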
As noted earlier, although the foregoing processing has been described in respect of the first image (or the selected part thereof), it is understood that in embodiments, the same or similar processing may be carried out on one or more of the images stored in the repository. This may form “pre-analysed” images or may be performed “on the fly”.
Other embodiments may be used in image restoration, for example to detect scratches in a digital representation of image material originally on film stock which has been scanned into digital formats. Other applications of the embodiments of the invention relate to general video processing. For instance, an object may be isolated from the image, processed and then replicated into the image. Processing might be for example colour correction or indeed other special effects. Another application may be to mark or tag an object within an image with a target hyper-link accurately. Systems for manually tagging faces in photographs often allow the user to define a face using a rectangle which may often overlap another face causing confusion from a user clicking on a hyper-link. Embodiments of the present invention may assist in more accurately defining a region to which the hyper-link may be assigned.
Although the foregoing processing describes the colour resolution reduction procedure as taking place on the whole image, it is envisaged that this could instead take place on only the selected part of the image. This would reduce processing load on the system.
Although some embodiments in the foregoing have been described with reference to finding feature data of segments (i.e. the foreground and background components are treated relatively equally), in some embodiments it is possible to find feature data of a foreground object and feature data of a background in an image or part of an image (for example, a segment). Using this, in embodiments, the feature data of the foreground object will be generated. Additionally, feature data for a part of, or all of, the background in the segment will be generated. The generated feature data of both the foreground object and the background will then be compared with feature data of similar combinations of foreground feature data and background feature data in the stored images. As a result of this comparison, it is possible, in embodiments, to generate a relevancy indicator which can be used to generate an ordered list. The most relevant stored images will be seen first by the user, in embodiments. This allows more relevant results to be returned to the user because the foreground object is seen in context. For instance, if the segment under test consists of an image of a beak in a wooded surrounding, a similar beak in a wooded surrounding is more relevant than a similar beak in a desert. Thus, this embodiment returns more relevant images.
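One way in which foreground and background feature data could be combined into a relevancy indicator is sketched below. The embodiments do not specify how the two comparisons are combined; the weighted sum of absolute differences, the `background_weight` parameter and the function names are assumptions made purely for illustration.

```python
def sad(a, b):
    """Sum of absolute differences between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def rank_by_context(query_fg, query_bg, stored, background_weight=0.5):
    """Order stored images by a combined score; a lower score is more relevant.

    stored maps image_id -> (foreground_features, background_features).
    """
    scores = {
        image_id: sad(query_fg, fg) + background_weight * sad(query_bg, bg)
        for image_id, (fg, bg) in stored.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])
```

Under this scheme two stored images with identical foreground feature data are separated by how closely their backgrounds match that of the image under test.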
In some embodiments the image to be tested may not be replicated and spatially transformed eight times (thus not allowing spatial transform to be applied for every possible one pixel displacement), rather the image may be replicated and spatially transformed fewer than eight times. Although this will give an incomplete analysis as to the presence of edge pixels, the information generated may be sufficient in some applications to provide enough information regarding edge pixels to be useful.
As will be understood, various modifications can be made to the embodiments described above without departing from the inventive concepts of the present invention. For example, although the present invention has been described with reference to a discrete computer apparatus, the invention could be implemented in a more distributed system operating across a number of connected computers. A server may store the images from the image repository and execute the search whilst a remote computer connected via a network connection to the server may specify the search criteria. This may be achieved by integrating parts of the system, for example the graphical user interface, into a “plug-in” for a web browser.
Although illustrative embodiments of the invention have been described in detail herein with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, and that various changes and modifications can be effected therein by one skilled in the art without departing from the scope and spirit of the invention as defined by the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
0721405.9 | Oct 2007 | GB | national |