The present description relates to a method and apparatus for media validation. It is particularly related to, but in no way limited to, such methods and apparatus which are able to react to improved quality counterfeit media such as passports, checks, banknotes, bonds, share certificates or other such media.
There is a growing need for automatic verification and validation of banknotes of different currencies and denominations in a simple, reliable, and cost effective manner. This is required, for example, in self-service apparatus which receives banknotes, such as self-service kiosks, ticket vending machines, automated teller machines arranged to take deposits, self-service currency exchange machines and the like. Automatic verification of other types of valuable media such as passports, checks and the like is also required.
Previously, manual methods of media validation have involved image examination, transmission effects such as watermarks and thread registration marks, and the feel and even smell of banknotes, passports, checks and the like. Other known methods have relied on semi-overt features requiring semi-manual interrogation, for example using magnetic means, ultraviolet sensors, fluorescence, infrared detectors, capacitance, metal strips, image patterns and similar. However, by their very nature these methods are manual or semi-manual and are not suitable for the many applications where manual intervention is unavailable for long periods of time, for example in self-service apparatus.
There are significant problems to be overcome in order to create an automatic media validator. For example, many different types of currency exist with different security features and even substrate types. Within those currencies, different denominations also commonly exist with different levels of security features. There is therefore a need to provide a generic method of easily and simply performing currency validation for those different currencies and denominations.
Put simply, the task of a currency validator is to determine whether a given banknote is genuine or counterfeit. Previous automatic validation methods typically require a relatively large number of examples of counterfeit banknotes to be known in order to train the classifier. In addition, those previous classifiers are trained to detect known counterfeits only. This is problematic because often little or no information is available about possible counterfeits. For example, this is particularly problematic for newly introduced denominations or newly introduced currency.
In an earlier paper entitled, “Employing optimized combinations of one-class classifiers for automated currency validation”, published in Pattern Recognition 37, (2004) pages 1085-1096, by Chao He, Mark Girolami and Gary Ross (two of whom are inventors of the present application) an automated currency validation method is described (Patent No. EP1484719, US2004247169). This involves segmenting an image of a whole banknote into regions using a grid structure. Individual “one-class” classifiers are built for each region and a small subset of the region specific classifiers are combined to provide an overall decision. (The term, “one-class” is explained in more detail below.) The segmentation and combination of region specific classifiers to achieve good performance is achieved by employing a genetic algorithm. This method requires a small number of counterfeit samples at the genetic algorithm stage and as such is not suitable when counterfeit data is unavailable.
There is also a need to perform automatic currency validation in a computationally inexpensive manner which can be performed in real time.
Another problem relates to situations in which automatic currency validation systems are in place and are operating relatively successfully in a given environment. For example, that environment comprises a population of genuine and counterfeit banknotes with a given quality range and distribution. If sudden changes to that environment occur, it is typically difficult for such automated currency validation systems to adapt. For example, suppose that new, higher quality counterfeit banknotes suddenly begin to enter the banknote population. Police intelligence, manual validation and other information sources might indicate the presence of the higher quality counterfeit banknotes. In this situation, if a bank or other provider finds that counterfeit notes are being accepted at automated currency validation machines, a commercial decision is typically made to stop using those machines. However, this is costly because manual validation needs to be carried out instead and customers are inconvenienced. Significant time and cost also need to be invested to upgrade the automated currency validation systems to cope with the higher quality counterfeit banknotes.
Many of the issues mentioned above also apply to validation of other types of valuable media such as passports, checks and the like.
A method of creating a classifier for media validation is described. Information from all of a set of training images from genuine media items only is used to form a segmentation map which is then used to segment each of the training set images. Features are extracted from the segments and used to form a classifier which is preferably a one-class statistical classifier. Classifiers can be quickly and simply formed for different currencies and denominations in this way and without the need for examples of counterfeit media items. A media validator using such a classifier is described as well as a method of validating a media item using such a classifier. In a preferred embodiment a plurality of segmentation maps are formed, having different numbers of segments. If higher quality counterfeit media items come into the population of media items, the media validator is able to automatically switch to using a segmentation map having a higher number of segments without the need for re-training.
The method may be performed by software in machine readable form on a storage medium. The method steps may be carried out in any suitable order and/or in parallel, as is apparent to the person skilled in the art.
This acknowledges that software can be a valuable, separately tradable commodity. It is intended to encompass software, which runs on or controls “dumb” or standard hardware, to carry out the desired functions, (and therefore the software essentially defines the functions of the media validator, and can therefore be termed a media validator, even before it is combined with its standard hardware). For similar reasons, it is also intended to encompass software which “describes” or defines the configuration of hardware, such as HDL (hardware description language) software, as is used for designing silicon chips, or for configuring universal programmable chips, to carry out desired functions.
The preferred features may be combined as appropriate, as would be apparent to a skilled person, and may be combined with any of the aspects of the invention.
Embodiments of the invention will be described, by way of example, with reference to the following drawings, in which:
Embodiments of the present invention are described below by way of example only. These examples represent the best ways of putting the invention into practice that are currently known to the Applicant although they are not the only ways in which this could be achieved. Although the present examples are described and illustrated herein as being implemented in a banknote validation system, the system described is provided as an example and not a limitation. As those skilled in the art will appreciate, the present examples are suitable for application in a variety of different types of media validation systems, including, but not limited to, passport validation systems, check validation systems, bond validation systems and share certificate validation systems.
The term “one-class classifier” is used to refer to a classifier that is formed or built using information about examples from a single class only, but which is used to allocate newly presented examples either to that single class or not. This differs from a conventional binary classifier, which is created using information about examples from two classes and which is used to allocate new examples to one or other of those two classes. A one-class classifier can be thought of as defining a boundary around a known class such that examples falling outside that boundary are deemed not to belong to the known class.
First we obtain a training set of images of genuine banknotes (see box 10 of
We next create a segmentation map using information from the training set images (see box 12 of
Using the segmentation map we segment each of the images in the training set (see box 14 of
A classifier is then formed using the feature information (see box 16 of
The method in
Using segmentation maps with different numbers of segments yields different results. In addition, as the number of segments increases, the processing required per banknote increases. In a preferred embodiment we therefore carry out trials during training and testing (if information about counterfeit notes is available) in order to select an optimum number of segments for the segmentation map.
This is indicated in
The method of
The optimum segmentation map and one or more other alternative segmentation maps are then stored (see box 19 of
It can be seen that, as the number of segments in the segmentation map increases, the chances of falsely accepting a counterfeit are reduced. However, there is a smaller increase in the risk of rejecting a genuine note.
In a preferred embodiment we select the fewest number of segments such that the false accept rate is almost zero. For example,
However, there may be a point during the life of the currency, where the quality of counterfeit banknotes increases. For example, the currency may become the target of a more organized counterfeit ring. Also, more advanced reprographic technology or techniques may become available. In this situation, counterfeit banknotes may be accepted as genuine by the automated system. This leads to an increase in the false accept rate as indicated in
This is illustrated in
By replacing the set of classification parameters in this way, retraining is not necessary. Thus a system for automatic currency validation can be quickly and simply adjusted to respond to introduction of higher quality counterfeit banknotes. This is described in more detail later in this document with reference to
More detail about examples of segmentation techniques is now given.
Previously in EP1484719 and US2004247169, (as mentioned in the background section) we used a segmentation technique that involved using a grid structure over the image plane and a genetic algorithm method to form the segmentation map. This necessitated using information about counterfeit notes, and incurring computational costs when performing genetic algorithm search.
The present invention uses a different method of forming the segmentation map which removes the need for using a genetic algorithm or equivalent method to search for a good segmentation map within a large number of possible segmentation maps. This reduces computational cost and improves performance. In addition the need for information about counterfeit banknotes is removed.
We believe that it is generally difficult in the counterfeiting process to provide a uniform quality of imitation across the whole note, and that certain regions of a note are therefore more difficult than others to copy successfully. We therefore recognized that, rather than using a rigidly uniform grid segmentation, we could improve banknote validation by using a more sophisticated segmentation. Empirical testing that we carried out indicated that this is indeed the case. Segmentation based on morphological characteristics such as pattern, color and texture led to better performance in detecting counterfeits. However, traditional image segmentation methods, such as those using edge detectors, were difficult to use when applied to each image in the training set. This is because varying results are obtained for each training set member and it is difficult to align corresponding features in different training set images. In order to avoid this problem of aligning segments we used, in one preferred embodiment, a so-called “spatio-temporal image decomposition”.
Details about the method of forming the segmentation map are now given. At a high level this method can be thought of as specifying how to divide the image plane into a plurality of segments, each comprising a plurality of specified pixels. The segments can be non-continuous as mentioned above. In the present invention, this specification is made on the basis of information from all images in the training set. In contrast, segmentation using a rigid grid structure does not require information from images in the training set.
For example, each segmentation map comprises information about relationships of corresponding image elements between all images in the training set.
Consider the images in the training set as being stacked and in registration with one another in the same orientation. Taking a given pixel in the note image plane this pixel is thought of as having a “pixel intensity profile” comprising information about the pixel intensity at that particular pixel position in each of the training set images. Using any suitable clustering algorithm, pixel positions in the image plane are clustered into segments, where pixel positions in those segments have similar or correlated pixel intensity profiles.
In a preferred example we use these pixel intensity profiles. However, it is not essential to use pixel intensity profiles; it is also possible to use other information from all images in the training set, for example intensity profiles for blocks of 4 neighboring pixels, or mean values of pixel intensities for pixels at the same location in each of the training set images.
A particularly preferred embodiment of our method of forming the segmentation map is now described in detail. This is based on the method taught in the following publication “EigenSegments: A spatio-temporal decomposition of an ensemble of images” by Avidan, S. Lecture Notes in Computer Science, 2352: 747-758, 2002.
Given an ensemble of images $\{I_i\}_{i=1,2,\ldots,N}$ which have been registered and scaled to the same size $r \times c$, each image $I_i$ can be represented by its pixels as $[a_{1i}, a_{2i}, \ldots, a_{Mi}]^T$ in vector form, where $a_{ji}$ $(j = 1, 2, \ldots, M)$ is the intensity of the $j$th pixel in the $i$th image and $M = r \cdot c$ is the total number of pixels in the image. A design matrix $A \in \mathbb{R}^{M \times N}$ can then be generated by stacking the vectors $I_i$ (zeroed using the mean value) of all images in the ensemble, thus $A = [I_1, I_2, \ldots, I_N]$. A row vector $[a_{j1}, a_{j2}, \ldots, a_{jN}]$ in $A$ can be seen as an intensity profile for a particular pixel (the $j$th) across the $N$ images. If two pixels come from the same pattern region of the image they are likely to have similar intensity values and hence a strong temporal correlation. Note that the term “temporal” here need not correspond exactly to the time axis but is borrowed to indicate the axis across different images in the ensemble. Our algorithm tries to find these correlations and segments the image plane spatially into regions of pixels that have similar temporal behavior. We measure this correlation by defining a metric between intensity profiles. A simple way is to use the Euclidean distance, i.e. the temporal correlation between two pixels $j$ and $k$ can be denoted as
$$d(j,k) = \sqrt{\sum_{i=1}^{N} (a_{ji} - a_{ki})^2}.$$
The smaller d(j,k), the stronger the correlation between the two pixels.
In order to decompose the image plane spatially using the temporal correlations between pixels, we run a clustering algorithm on the pixel intensity profiles (the rows of the design matrix A). It will produce clusters of temporally correlated pixels. The most straightforward choice is to employ the K-means algorithm, but it could be any other clustering algorithm. As a result the image plane is segmented into several segments of temporally correlated pixels. This can then be used as a map to segment all images in the training set; and a classifier can be built on features extracted from those segments of all images in the training set.
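The clustering step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the preferred embodiment: images are flattened lists of intensities, each image is zeroed by its own mean intensity (one possible reading of "zeroed using the mean value"), the K-means initialization is a deterministic farthest-point rule chosen only for reproducibility, and the per-segment mean intensity is a hypothetical choice of feature. All function names are illustrative.

```python
import math

def intensity_profiles(images):
    """Rows of the design matrix A: one profile per pixel position.
    Assumption: each image is zeroed by its own mean intensity."""
    centred = []
    for img in images:
        mean = sum(img) / len(img)
        centred.append([v - mean for v in img])
    n_pixels = len(images[0])
    return [[img[j] for img in centred] for j in range(n_pixels)]

def distance(p, q):
    """Euclidean distance d(j, k) between two intensity profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def kmeans_segmentation(profiles, k, iterations=50):
    """Cluster pixel positions by profile; returns one segment label per pixel."""
    # Farthest-point initialisation (illustrative, keeps the sketch deterministic).
    centres = [profiles[0]]
    while len(centres) < k:
        centres.append(max(profiles,
                           key=lambda p: min(distance(p, c) for c in centres)))
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda c: distance(p, centres[c]))
                      for p in profiles]
        if new_labels == labels:          # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):                # move each centre to its cluster mean
            members = [p for p, l in zip(profiles, labels) if l == c]
            if members:
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def segment_features(image, labels, k):
    """Hypothetical feature: mean intensity of each segment of one image."""
    feats = []
    for c in range(k):
        vals = [v for v, l in zip(image, labels) if l == c]
        feats.append(sum(vals) / len(vals) if vals else 0.0)
    return feats

# Four tiny registered "training images" of six pixels each (made-up values).
training_images = [
    [10, 12, 11, 200, 198, 202],
    [20, 22, 21, 180, 178, 182],
    [15, 17, 16, 220, 218, 222],
    [12, 14, 13, 190, 188, 192],
]
segmentation_map = kmeans_segmentation(intensity_profiles(training_images), k=2)
features = [segment_features(img, segmentation_map, 2) for img in training_images]
```

The same `segmentation_map` is applied to every training image, so the segments are aligned across the training set by construction, and the resulting `features` are what a one-class classifier would be built on.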
In order to achieve the training without utilizing counterfeit notes, a one-class classifier is preferable. Any suitable type of one-class classifier known in the art can be used, for example neural network based one-class classifiers and statistical based one-class classifiers.
Suitable statistical methods for one-class classification are in general based on maximization of the log-likelihood ratio under the null hypothesis that the observation under consideration is drawn from the target class. These include the $D^2$ test (described in Morrison, D F: Multivariate Statistical Methods (third edition). McGraw-Hill Publishing Company, New York, 1990), which assumes a multivariate Gaussian distribution for the target class (genuine currency). In the case of an arbitrary non-Gaussian distribution, the density of the target class can be estimated using, for example, a semi-parametric Mixture of Gaussians (described in Bishop, C M: Neural Networks for Pattern Recognition, Oxford University Press, New York, 1995) or a non-parametric Parzen window (described in Duda, R O, Hart, P E, Stork, D G: Pattern Classification (second edition), John Wiley & Sons, INC, New York, 2001), and the distribution of the log-likelihood ratio under the null hypothesis can be obtained by sampling techniques such as the bootstrap (described in Wang, S, Woodward, W A, Gray, H L et al: A new test for outlier detection from a multivariate mixture distribution, Journal of Computational and Graphical Statistics, 6(3): 285-299, 1997).
Other methods which can be employed for one-class classification are Support Vector Data Domain Description (SVDD) (described in Tax, DMJ, Duin, RPW: Support vector domain description, Pattern Recognition Letters, 20(11-12): 1191-1199, 1999), also known as ‘support estimation’ (described in Hayton, P, Schölkopf, B, Tarassenko, L, Anuzis, P: Support Vector Novelty Detection Applied to Jet Engine Vibration Spectra, Advances in Neural Information Processing Systems, 13, eds Leen, Todd K and Dietterich, Thomas G and Tresp, Volker, MIT Press, 946-952, 2001), and Extreme Value Theory (EVT) (described in Roberts, S J: Novelty detection using extreme value statistics. IEE Proceedings on Vision, Image & Signal Processing, 146(3): 124-129, 1999). In SVDD the support of the data distribution is estimated, whilst EVT estimates the distribution of extreme values. For this particular application, large numbers of examples of genuine notes are available, so it is possible to obtain reliable estimates of the target class distribution. In a preferred embodiment we therefore choose one-class classification methods that can estimate the density distribution explicitly, although this is not essential. In a preferred embodiment we use one-class classification methods based on the parametric $D^2$ test.
For example, the statistical hypothesis tests used for our one-class classifier are detailed as follows:
Consider $N$ independent and identically distributed $p$-dimensional vector samples (the feature set for each banknote) $x_1, \ldots, x_N \in C$ with an underlying density function with parameters $\theta$ given as $p(x|\theta)$. The following hypothesis test is given for a new point $x_{N+1}$: $H_0: x_{N+1} \in C$ vs. $H_1: x_{N+1} \notin C$, where $C$ denotes the region where the null hypothesis is true and is defined by $p(x|\theta)$. Assuming that the distribution under the alternate hypothesis is uniform, then the standard log-likelihood ratio for the null and alternate hypothesis
$$\lambda = \frac{\sup_{\theta} L_0(\theta)}{\sup_{\theta} L_1(\theta)} = \frac{\sup_{\theta} \prod_{n=1}^{N+1} p(x_n|\theta)}{\sup_{\theta} \prod_{n=1}^{N} p(x_n|\theta)} \tag{1}$$
can be employed as a test statistic for the null hypothesis. In this preferred embodiment we can use the log-likelihood ratio as the test statistic for the validation of a newly presented note.
Feature vectors with multivariate Gaussian density: Under the assumption that the feature vectors describing individual points in a sample are multivariate Gaussian, a test that emerges from the above likelihood ratio (1), to assess whether each point in a sample shares a common mean, is described in (Morrison, D F: Multivariate Statistical Methods (third edition). McGraw-Hill Publishing Company, New York, 1990). Consider $N$ independent and identically distributed $p$-dimensional vector samples $x_1, \ldots, x_N$ from a multivariate normal distribution with mean $\mu$ and covariance $C$, whose sample estimates are $\hat{\mu}_N$ and $\hat{C}_N$. From the sample consider a random selection denoted as $x_0$. The associated squared Mahalanobis distance
$$D^2 = (x_0 - \hat{\mu}_N)^T \hat{C}_N^{-1} (x_0 - \hat{\mu}_N) \tag{2}$$
can be shown to be distributed as a central F-distribution with $p$ and $N-p-1$ degrees of freedom by
$$F = \frac{(N-p-1)\,N D^2}{p(N-1)^2 - N p D^2}. \tag{3}$$
Then, the null hypothesis of a common population mean vector for $x_0$ and the remaining $x_i$ will be rejected if
$$F > F_{\alpha;\,p,\,N-p-1}, \tag{4}$$
where $F_{\alpha;\,p,\,N-p-1}$ is the upper $\alpha \cdot 100\%$ point of the F-distribution with $(p, N-p-1)$ degrees of freedom.
Now suppose that $x_0$ was chosen as the observation vector with the maximum $D^2$ statistic. The distribution of the maximum $D^2$ from a random sample of size $N$ is complicated. However, a conservative approximation to the $100\alpha$ percent upper critical value can be obtained by the Bonferroni inequality. Therefore we might conclude that $x_0$ is an outlier if
$$F > F_{\alpha/N;\,p,\,N-p-1}. \tag{5}$$
In practice, both equations (4) and (5) can be used for outlier detection.
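We can make use of the following incremental estimates of the mean and covariance in devising a test for new examples which do not form part of the original sample when an additional datum $x_{N+1}$ is made available, i.e. the mean
$$\hat{\mu}_{N+1} = \frac{1}{N+1}\left(N\hat{\mu}_N + x_{N+1}\right) \tag{6}$$
and the covariance
$$\hat{C}_{N+1} = \frac{N}{N+1}\hat{C}_N + \frac{N}{(N+1)^2}(x_{N+1} - \hat{\mu}_N)(x_{N+1} - \hat{\mu}_N)^T. \tag{7}$$
By using the expressions (6) and (7) and the matrix inversion lemma, Equation (2) for an $N$-sample reference set and an $(N+1)$th test point becomes
$$D^2 = \sigma_{N+1}^T \hat{C}_{N+1}^{-1} \sigma_{N+1}, \tag{8}$$
where
$$\sigma_{N+1} = x_{N+1} - \hat{\mu}_{N+1} = \frac{N}{N+1}(x_{N+1} - \hat{\mu}_N). \tag{9}$$
Denoting $(x_{N+1} - \hat{\mu}_N)^T \hat{C}_N^{-1} (x_{N+1} - \hat{\mu}_N)$ by $D^2_{N+1,N}$, then
$$D^2 = \frac{N D^2_{N+1,N}}{N + 1 + D^2_{N+1,N}}. \tag{10}$$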
So a new point $x_{N+1}$ can be tested against an estimated and assumed normal distribution for a common estimated mean $\hat{\mu}_N$ and covariance $\hat{C}_N$. Though the assumption of multivariate Gaussian feature vectors often does not hold in practice, it has been found to be an appropriate pragmatic choice for many applications. We relax this assumption and consider arbitrary densities in the following section.
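As a hedged numerical illustration of testing a new point against an N-sample reference set, the sketch below computes the squared Mahalanobis distance of the new point from the N-sample estimates and converts it to the distance under the (N+1)-sample estimates using the closed form $N D^2_{N+1,N} / (N + 1 + D^2_{N+1,N})$, which follows from the matrix inversion lemma. It assumes a maximum-likelihood (1/N) covariance estimate; that normalization, the helper names and the sample values are assumptions made for the example only.

```python
def mean_vector(xs):
    n = len(xs)
    return [sum(col) / n for col in zip(*xs)]

def covariance(xs, mu):
    """Maximum-likelihood (1/N) covariance estimate (an assumed normalisation)."""
    n, p = len(xs), len(mu)
    return [[sum((x[a] - mu[a]) * (x[b] - mu[b]) for x in xs) / n
             for b in range(p)] for a in range(p)]

def solve(matrix, rhs):
    """Solve matrix @ y = rhs by Gaussian elimination with partial pivoting."""
    p = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, p):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, p + 1):
                aug[r][c] -= factor * aug[col][c]
    y = [0.0] * p
    for r in reversed(range(p)):
        tail = sum(aug[r][c] * y[c] for c in range(r + 1, p))
        y[r] = (aug[r][p] - tail) / aug[r][r]
    return y

def mahalanobis_sq(x, mu, cov):
    """(x - mu)^T cov^{-1} (x - mu), without forming the inverse explicitly."""
    diff = [a - b for a, b in zip(x, mu)]
    return sum(d * s for d, s in zip(diff, solve(cov, diff)))

def incremental_d2(x_new, reference):
    """D^2 of x_new under the (N+1)-sample estimates, from N-sample estimates."""
    n = len(reference)
    mu = mean_vector(reference)
    d = mahalanobis_sq(x_new, mu, covariance(reference, mu))  # D^2_{N+1,N}
    return n * d / (n + 1 + d)

# Made-up two-dimensional feature vectors for five reference notes.
reference = [[1.0, 2.0], [2.0, 1.5], [3.0, 3.5], [4.0, 3.0], [5.0, 5.5]]
x_new = [2.5, 6.0]
d2_incremental = incremental_d2(x_new, reference)

# Cross-check: recompute the estimates directly from all N+1 points.
everything = reference + [x_new]
mu_all = mean_vector(everything)
d2_direct = mahalanobis_sq(x_new, mu_all, covariance(everything, mu_all))
```

The incremental form means the reference mean and covariance can be estimated once during training and reused for every newly presented note; the resulting statistic would then be compared with an F-distribution critical value taken from tables or a statistics library.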
Feature vectors with arbitrary density: A probability density estimate $\hat{p}(x;\theta)$ can be obtained from the finite data sample $S = \{x_1, \ldots, x_N\} \subset \mathbb{R}^d$ drawn from an arbitrary density $p(x)$, by using any suitable semi-parametric (e.g. Gaussian Mixture Model) or non-parametric (e.g. Parzen window method) density estimation method as known in the art. This density can then be employed in computing the log-likelihood ratio (1). Unlike the case of the multivariate Gaussian distribution, there is no analytic distribution for the test statistic ($\lambda$) under the null hypothesis. Numerical bootstrap methods can therefore be employed to obtain the otherwise non-analytic null distribution under the estimated density, and the various critical values $\lambda_{crit}$ can be established from the empirical distribution obtained. It can be shown that in the limit as $N \to \infty$, the likelihood ratio can be estimated by the following
$$\lambda \approx \hat{p}(x_{N+1}; \hat{\theta}_N), \tag{11}$$
where $\hat{p}(x_{N+1}; \hat{\theta}_N)$ denotes the probability density of $x_{N+1}$ under the model estimated by the original $N$ samples.
After generating $B$ bootstrap sets of $N$ samples from the reference data set and using each of these to estimate the parameters of the density distribution $\hat{\theta}_N^i$, $B$ bootstrap replicates of the test statistic $\lambda_{crit}^i$, $i = 1, \ldots, B$, can be obtained by randomly selecting an $(N+1)$th sample and computing $\hat{p}(x_{N+1}; \hat{\theta}_N^i) \approx \lambda_{crit}^i$. By arranging the $\lambda_{crit}^i$ in ascending order, the critical value $\lambda_\alpha$ can be defined so as to reject the null hypothesis at the desired significance level if $\lambda \le \lambda_\alpha$, where $\lambda_\alpha$ is the $j$th smallest value of $\lambda_{crit}^i$ and $\alpha = j/(B+1)$.
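The bootstrap procedure can be sketched as below. This is an illustrative sketch only: it assumes one-dimensional features, a Gaussian-kernel Parzen window with a hand-picked bandwidth `h`, resampling with replacement, and the (N+1)th sample drawn from the reference set itself. The function names and the synthetic reference data are hypothetical.

```python
import math
import random

def parzen_density(x, samples, h):
    """Gaussian-kernel Parzen window estimate of the density at x (1-D)."""
    norm = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)

def bootstrap_critical_value(reference, h, alpha, n_boot, rng):
    """Empirical critical value lambda_alpha from n_boot bootstrap replicates."""
    n = len(reference)
    replicates = []
    for _ in range(n_boot):
        resample = [rng.choice(reference) for _ in range(n)]  # bootstrap set
        extra = rng.choice(reference)          # randomly selected (N+1)th sample
        replicates.append(parzen_density(extra, resample, h))
    replicates.sort()                          # ascending order
    j = max(1, int(alpha * (n_boot + 1)))      # alpha = j / (B + 1)
    return replicates[j - 1]                   # j-th smallest replicate

def is_rejected(x_new, reference, h, lambda_alpha):
    """Reject the null hypothesis (genuine) if lambda <= lambda_alpha."""
    return parzen_density(x_new, reference, h) <= lambda_alpha

# Synthetic stand-in for a one-dimensional feature of genuine notes.
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(60)]
lambda_alpha = bootstrap_critical_value(reference, h=0.5, alpha=0.05,
                                        n_boot=199, rng=rng)
far_point_rejected = is_rejected(25.0, reference, 0.5, lambda_alpha)
central_point_rejected = is_rejected(0.0, reference, 0.5, lambda_alpha)
```

With B = 199 replicates and a significance level of 0.05, the critical value is the 10th smallest replicate. A multivariate feature vector would need a multivariate kernel or a mixture model in place of `parzen_density`.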
Preferably the method of forming the classifier is repeated for different numbers of segments and tested using images of banknotes known to be either counterfeit or not. The number of segments giving the best performance and its corresponding set of classification parameters are selected. We found the best number of segments to be from about 2 to 15 for most currencies, although any suitable number of segments can be used.
Optionally the apparatus for creating the classifier also comprises a selector which selects an optimum segmentation map and/or associated set of classification parameters as well as one or more alternative segmentation maps and/or associated sets of classification parameters by evaluating the classification performance of each.
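One way such a selector might work is sketched below, using the criterion stated earlier of choosing the fewest segments for which the false accept rate is almost zero, and keeping maps with more segments as alternatives. The tolerance, the function name and the trial figures are all hypothetical and serve only to illustrate the selection logic.

```python
def select_maps(results, far_tolerance=0.001, n_alternatives=1):
    """Pick the optimum number of segments and some higher-count alternatives.

    results maps number_of_segments -> (false_accept_rate, false_reject_rate).
    """
    counts = sorted(results)
    acceptable = [k for k in counts if results[k][0] <= far_tolerance]
    # Fall back to the lowest false accept rate if no map meets the tolerance.
    optimum = acceptable[0] if acceptable else min(counts,
                                                   key=lambda k: results[k][0])
    alternatives = [k for k in counts if k > optimum][:n_alternatives]
    return optimum, alternatives

# Hypothetical trial results: (false accept rate, false reject rate).
trial_results = {
    2: (0.080, 0.010),
    3: (0.020, 0.012),
    5: (0.000, 0.015),
    8: (0.000, 0.020),
    15: (0.000, 0.030),
}
optimum, alternatives = select_maps(trial_results, n_alternatives=2)
```

Choosing the fewest acceptable segments keeps the per-note processing cost and the false reject rate low, while the stored alternatives remain available if counterfeit quality later improves.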
The information or received instruction triggers activation of an alternative stored segmentation map (see box 51). This segmentation map has a different number of segments (usually a higher number) than the segmentation map previously used. This alternative segmentation map can either be stored locally in a self-service apparatus beforehand, or stored centrally on a server and then distributed remotely over the network to the affected apparatus when necessary. Once the alternative segmentation map is activated, replacing the previous segmentation map, the method proceeds as described with reference to
Whilst the alternative segmentation map is being used it is possible for developers to create a new segmentation map to combat the counterfeit attack which uses a lower number of segments than the alternative segmentation map. Thus the use of the alternative template allows the automatic currency validation process to proceed whilst any retraining, template development, and distribution of the resulting material takes place.
In the method described above, only one alternative segmentation map is created and stored. However, it is possible to create and store a plurality of such alternative segmentation maps with different numbers of segments. It is then possible to select which of the alternative segmentation templates to use on a trial and error basis, or on the basis of previous experience, and/or detailed information about the particular counterfeit attack being experienced.
Also, the methods described herein have focused on situations where the number of segments increases. However, it is also possible for the number of segments to decrease. For example, suppose that an alternative template is being used with 15 segments. This incurs a relatively high processing cost and burden. Later, the source of the counterfeit notes is prevented such that it is possible to return to a segmentation template having fewer segments.
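The switching behavior described above, in both directions, might be organized as in the following sketch. The class and method names are hypothetical, and the stored maps and classification parameters are represented by placeholder strings; a real validator would hold the trained segmentation maps and their associated parameters.

```python
class SegmentationMapStore:
    """Holds pre-trained maps keyed by number of segments; one is active."""

    def __init__(self, maps, active_segments):
        if active_segments not in maps:
            raise KeyError(active_segments)
        self.maps = dict(maps)
        self.active_segments = active_segments

    def active(self):
        """Return the (segmentation map, classification parameters) in use."""
        return self.maps[self.active_segments]

    def escalate(self):
        """Switch to the stored map with the next higher number of segments."""
        higher = sorted(k for k in self.maps if k > self.active_segments)
        if not higher:
            return False                 # nothing finer is stored
        self.active_segments = higher[0]
        return True

    def relax(self):
        """Switch back to the stored map with the next lower number of segments."""
        lower = sorted(k for k in self.maps if k < self.active_segments)
        if not lower:
            return False                 # nothing coarser is stored
        self.active_segments = lower[-1]
        return True

# Placeholder entries standing in for trained maps and parameters.
store = SegmentationMapStore(
    {3: ("map_3", "params_3"), 5: ("map_5", "params_5"), 15: ("map_15", "params_15")},
    active_segments=3,
)
store.escalate()    # e.g. on receiving notice of higher quality counterfeits
```

Because every stored map was trained in advance, switching only changes which entry is active and no retraining takes place at the time of the switch.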
Previously, segmentation has been based on spatial position alone. We improve on this by basing segmentation on feature values, such as pixel intensity profiles across images in the training set, so that each training set image has an influence on the segmentation. This is not the case when grid segmentation is used.
The means for accepting banknotes is of any suitable type as known in the art, as is the imaging means. A feature selection algorithm may be used to select one or more types of feature to use in the step of extracting features. Also, the classifier can be formed on the basis of specified information about a particular denomination or currency of banknotes in addition to the feature information discussed herein. For example, this may be information about particularly data-rich regions in terms of color or other information, spatial frequency or shapes in a given currency and denomination.
The methods described herein are performed on images or other representations of banknotes, those images/representations being of any suitable type, for example images on any of a red, blue and green channel, or other images as mentioned above.
The segmentation may be formed on the basis of the images of only one type, say the red channel. Alternatively, the segmentation map may be formed on the basis of the images of all types, say the red, blue and green channel. It is also possible to form a plurality of segmentation maps, one for each type of image or combination of image types. For example, there may be three segmentation maps one for the red channel images, one for the blue channel images and one for the green channel images. In that case, during validation of an individual note, the appropriate segmentation map/classifier is used depending on the type of image selected. Thus each of the methods described above may be modified by using images of different types and corresponding segmentation maps/classifiers.
Any range or device value given herein may be extended or altered without losing the effect sought, as will be apparent to the skilled person.
It will be understood that the above description of a preferred embodiment is given by way of example only and that various modifications may be made by those skilled in the art.
This application is a continuation-in-part application of U.S. patent application Ser. No. 11/366,147, filed on Mar. 2, 2006, which is a continuation-in-part application of U.S. patent application Ser. No. 11/305,537, filed on Dec. 16, 2005 now abandoned. Application Ser. No. 11/366,147, filed on Mar. 2, 2006 and application Ser. No. 11/305,537, filed on Dec. 16, 2005 are hereby incorporated by reference.
Number | Name | Date | Kind |
---|---|---|---|
5048095 | Bhanu et al. | Sep 1991 | A |
5729623 | Omatu et al. | Mar 1998 | A |
6163618 | Mukai | Dec 2000 | A |
20030021459 | Neri et al. | Jan 2003 | A1 |
20030128874 | Fan | Jul 2003 | A1 |
20030217906 | Baudat et al. | Nov 2003 | A1 |
20040183923 | Dalrymple | Sep 2004 | A1 |
20040247169 | Ross et al. | Dec 2004 | A1 |
Number | Date | Country |
---|---|---|
1 484 719 | Dec 2004 | EP |
1 217 589 | Feb 2007 | EP |
Number | Date | Country |
---|---|---|
20070154099 A1 | Jul 2007 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 11366147 | Mar 2006 | US |
Child | 11639576 | | US |
Parent | 11305537 | Dec 2005 | US |
Child | 11366147 | | US |