This application is based upon and claims the benefit of priority from the prior Japanese Patent Application No. 2007-183788, filed on Jul. 13, 2007, the entire contents of which are incorporated herein by reference.
The present invention relates to a pattern search apparatus, and a method thereof, for searching a training pattern set at high speed for the nearest neighbor pattern of an input pattern by using a hash function.
A conventional approximate nearest neighbor search method using a hash function is disclosed in P. Indyk and R. Motwani, "Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality," In Proceedings of the 30th ACM Symposium on Theory of Computing (STOC'98), pp. 604-613, May 1998, and in M. Datar, P. Indyk, N. Immorlica, and V. Mirrokni, "Locality-Sensitive Hashing Scheme Based on p-Stable Distributions," In Proceedings of the 20th Annual Symposium on Computational Geometry (SCG2004), June 2004. In these methods, the pattern space is divided by hash functions, and an approximate nearest neighbor is searched from the training patterns in the divided space areas.
However, in the space areas (hereinafter referred to as buckets) obtained by division by the hash function, the number of training patterns existing in the bucket containing the input pattern as the search object varies according to the distribution of the training pattern set. In a bucket with a high training pattern density, the search time becomes long, and in a bucket with a low training pattern density, the ratio (hereinafter referred to as the error ratio) of errors between the distance of the true nearest neighbor pattern and that of the obtained approximate nearest neighbor becomes high.
Further, in the case where no training pattern exists in the bucket containing the input pattern as the search object, a search cannot be made at all.
Accordingly, an advantage of the present invention is to provide a pattern search apparatus and a method thereof that increase the search speed and reduce the error ratio.
In order to achieve the above-described advantage, a first aspect of the present invention provides a pattern search apparatus including: a storage unit configured to store a plurality of training patterns having d dimensions; a distribution acquisition unit configured to obtain a cumulative probability distribution representing the cumulative probability of the existing probabilities of the respective training patterns on an arbitrary dimensional axis of the d dimensions; a hash function unit configured to obtain a hash function that converts a value of an arbitrary point in the distribution section on the arbitrary dimensional axis corresponding, in the cumulative probability distribution, to each probability section obtained by dividing the cumulative probability, into a hash value corresponding to that probability section; a training unit configured to obtain hash values of the respective training patterns by using the hash function and to classify the respective training patterns into buckets corresponding to the hash values; and a search unit configured to obtain a hash value of an input pattern by using the hash function and to search for the training pattern most similar to the input pattern from among the training patterns belonging to the bucket corresponding to the hash value of the input pattern.
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
(First Embodiment)
Hereinafter, a pattern search apparatus 1 of a first embodiment will be described.
In pattern search apparatus 1, the cumulative probability distribution of the training patterns on an arbitrary axis is approximated by a sigmoid function, and hash functions, each of which divides the probability value into constant intervals, are defined based on the cumulative probability distribution.
Then, when an unknown pattern is inputted, the hash value outputted by each hash function identifies a bucket, and the union of the subsets of the training pattern set existing in those buckets is obtained. The training pattern most similar to the input pattern (hereinafter referred to as the nearest neighbor pattern) is then searched for within this union.
Pattern search apparatus 1 of this embodiment is used for pattern recognition, and can be applied to, for example, the recognition of an object captured in an image (e.g., face recognition) or to data mining.
Pattern search apparatus 1 includes a storage unit 2, a distribution acquisition unit 3, a hash function unit 4, a training unit 5, and a search unit 6.
The respective functions of units 2 to 6 described below can also be realized by a program, stored on a computer-readable medium, which causes a computer to implement these functions.
Storage unit 2 has previously stored training patterns such as face images, which become search objects, as a training pattern set.
Then, the stored training patterns are sent to the distribution acquisition unit 3 and training unit 5.
The distribution acquisition unit 3 obtains a cumulative probability distribution on an arbitrary axis of the training pattern set stored in storage unit 2 by a principal component analysis.
In this embodiment, it is assumed that the training pattern set follows a normal distribution (Gaussian distribution), and the cumulative probability distributions on all of the dimensions are obtained. When the cumulative probability distribution on the d-th dimension is estimated, it is approximated by a sigmoid function of following expression (1).

Psd(x) = 1/(1 + exp(−(x − μ)/α)) (1)
Here, μ denotes a mean value, and α denotes a standard deviation. With respect to these two parameters, the mean value and the standard deviation of the training pattern set are used as initial values, and the fitting of the sigmoid function to the cumulative probability distribution is optimized by the least squares approximation.
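By way of illustration only, this fitting may be sketched in Python as follows; the function names and the use of scipy's curve_fit are assumptions made for the sketch and are not recited in the embodiment.

    import numpy as np
    from scipy.optimize import curve_fit

    def sigmoid(x, mu, alpha):
        # Expression (1): logistic approximation of the cumulative distribution.
        return 1.0 / (1.0 + np.exp(-(x - mu) / alpha))

    def fit_cumulative_distribution(values):
        # Empirical cumulative probability at each sorted sample value.
        xs = np.sort(values)
        cdf = np.arange(1, len(xs) + 1) / len(xs)
        # The sample mean and standard deviation serve as the initial values.
        p0 = [np.mean(xs), np.std(xs)]
        (mu, alpha), _ = curve_fit(sigmoid, xs, cdf, p0=p0)
        return mu, alpha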
The respective acquired cumulative probability distributions are sent to hash function unit 4.
Hash function unit 4 obtains a hash function that divides each of the cumulative probability distributions obtained by distribution acquisition unit 3 into n parts so that the probability values become uniform.
The hash function can be defined by following expression (2).
hd(x) = ⌊Psd(x)/Δ⌋ (2)
Here, Δ denotes the probability value of one bucket when the value range [0, 1] of Psd(x) is divided into n parts as [0, Δ], [Δ, 2Δ], …, [(n−1)Δ, 1].
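Continuing the illustrative sketch above, expression (2) may be realized as follows; the clamping of the boundary value Psd(x) = 1 into the last bucket is an implementation detail assumed for the sketch, not recited in the expression.

    import math

    def hash_value(x, mu, alpha, n):
        # Expression (2): h_d(x) = floor(Ps_d(x) / delta), with delta = 1/n.
        p = 1.0 / (1.0 + math.exp(-(x - mu) / alpha))  # Ps_d(x)
        delta = 1.0 / n
        # Clamp the boundary value Ps_d(x) = 1 into the last bucket n - 1.
        return min(int(p / delta), n - 1)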
The hash function on each dimension is acquired and is sent to training unit 5 and search unit 6.
Training unit 5 distributes the respective training patterns of the training pattern set stored in storage unit 2 to the buckets corresponding to the hash values outputted by the respective hash functions acquired by hash function unit 4, thereby constructing subsets and forming a list structure.
In the case where the dimension number of the pattern is D and each hash function makes a division into n parts, D × n subsets are created in total.
The created subsets are sent to search unit 6.
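A minimal sketch of this construction, reusing the illustrative hash_value function above and assuming one fitted (μ, α) pair per dimensional axis, might look as follows.

    from collections import defaultdict

    def build_buckets(patterns, params, n):
        # patterns: sequence of D-dimensional vectors; params[d] = (mu, alpha) per axis.
        D = len(params)
        # buckets[d][h] lists the indices of the training patterns whose value on
        # dimensional axis d falls into the bucket of hash value h.
        buckets = [defaultdict(list) for _ in range(D)]
        for idx, x in enumerate(patterns):
            for d in range(D):
                mu, alpha = params[d]
                buckets[d][hash_value(x[d], mu, alpha, n)].append(idx)
        return buckets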
First, an unknown pattern q, such as a face image, to be retrieved is inputted to search unit 6. Then, the training pattern most similar to input pattern q (that is, the nearest neighbor pattern) is searched for within the training pattern set ALL.
The search is made based on the respective hash functions acquired by hash function unit 4 and the subsets created by training unit 5.
When input pattern q is made the input value, the subset BiH limited by the hash value H outputted by the i-th hash function can be defined by following expression (3).
BiH = {x | x ∈ ALL, hi(x) = H} (3)
Here, the union of the subsets obtained by m hash functions selected at random is made a nearest neighbor candidate set C(q) of following expression (4).

C(q) = B1h1(q) ∪ B2h2(q) ∪ … ∪ Bmhm(q) (4)
At this time, when the nearest neighbor candidate set C(q) is acquired, the number of buckets containing each candidate x is made its multiplicity w(x), the candidates are sorted by this multiplicity, and the distance between the candidate of maximum multiplicity and the input pattern is made a provisional distance z. When the Lp distance between x1 and x2 is defined as dp(x1, x2), the provisional distance z is expressed by following expression (5).

z = dp(q, x*), where x* = argmax{w(x) | x ∈ C(q)} (5)
Here, since a candidate x satisfying following expression (6) cannot become the nearest neighbor, the candidate can be removed from the nearest neighbor candidate set C(q), where qi and xi denote the components of q and x on the i-th dimensional axis.

|qi − xi|p > zp (6)
Based on this property, a search refinement is performed by the following procedure; an illustrative sketch of the whole search is given after step 4.
At step 1, i is made 1, and the initial candidate set C0(q) is made C(q).
At step 2, with respect to the respective elements of Ci−1(q), the candidates satisfying the condition of expression (6) are removed, and the obtained candidate set is made Ci(q). When i = m, advance is made to step 4. When |Ci(q)| = 1, the remaining element is made the nearest neighbor pattern and the procedure stops.
At step 3, i is made i + 1, and return is made to step 2.
At step 4, distance calculation is performed between the input pattern and all of the candidates in Ci(q), and the training pattern giving the minimum distance is made the nearest neighbor pattern and the procedure stops.
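The following sketch combines the candidate collection of expressions (3) to (5) with the refinement of expression (6); it reuses the illustrative hash_value and build_buckets functions above, and the assumptions that q and the training patterns are numpy arrays and that p defaults to 2 are choices made for the sketch.

    import numpy as np

    def search_nearest(q, patterns, buckets, params, n, m, p=2):
        # Select m hash functions (dimensional axes) at random.
        dims = np.random.choice(len(params), size=m, replace=False)
        # Candidate set C(q) as the union of the hit buckets; w counts the
        # multiplicity, i.e. in how many of the m buckets each candidate appears.
        w = {}
        for d in dims:
            mu, alpha = params[d]
            for idx in buckets[d].get(hash_value(q[d], mu, alpha, n), []):
                w[idx] = w.get(idx, 0) + 1
        if not w:
            return None  # no training pattern shares a bucket with q
        # Provisional distance z: distance to the candidate of maximum multiplicity.
        best = max(w, key=w.get)
        z_p = np.sum(np.abs(q - patterns[best]) ** p)  # z**p; the p-th root is unneeded
        # Refinement by expression (6): remove any candidate whose distance on a
        # single axis already exceeds z.
        C = set(w)
        for d in dims:
            C = {i for i in C if abs(q[d] - patterns[i][d]) ** p <= z_p}
            if len(C) == 1:
                return C.pop()
        # Step 4: full distance calculation over the surviving candidates.
        return min(C, key=lambda i: np.sum(np.abs(q - patterns[i]) ** p))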
The finally acquired nearest neighbor pattern is outputted to the outside of the pattern search apparatus 1.
In the above embodiment, the nearest neighbor candidate set C(q) in search unit 6 may be rearranged by the multiplicity w(x), and a search may be made for only the upper b% of the candidates.
In the above embodiment, although the union of the subsets obtained by the hash functions in search unit 6 is made the nearest neighbor candidate set C(q), the intersection of those subsets, or a union close to the intersection, may instead be made the nearest neighbor candidate set C(q) for the search.
According to the preferred embodiments of the invention, since a hash function that divides the probability value at constant intervals is defined from the cumulative probability distribution of the training patterns, the number of patterns in each bucket becomes almost constant; it therefore becomes possible to increase the average search speed when the nearest neighbor solution is obtained and to reduce the error ratio.
(Second Embodiment)
A pattern search apparatus 101 of a second embodiment of the invention will be described.
Pattern search apparatus 101 of this embodiment is different from pattern search apparatus 1 of the first embodiment in the following two points.
The first different point is that, in an orthogonal transformation unit 107, each training pattern of the training pattern set is projected onto a subspace by an orthogonal transformation that preserves the Euclidean distance of the vector space, and the orthogonally transformed training pattern is stored in a storage unit 102.
The second different point is that, in a distribution acquisition unit 103, a cumulative probability distribution is acquired by a weighted addition of sigmoid functions with respect to the respective training patterns.
Pattern search apparatus 101 includes storage unit 102, distribution acquisition unit 103, a hash function unit 104, a training unit 105, a search unit 106, and orthogonal transformation unit 107.
Incidentally, in the operation of pattern search apparatus 101, a description of the processes similar to those of pattern search apparatus 1 of the first embodiment will be omitted.
Orthogonal transformation unit 107 has previously performed a principal component analysis of the training pattern set as the retrieval object. The eigenvectors up to the N-th one in descending order of eigenvalue, that is, an N×D orthogonal transformation matrix O = {φT1, φT2, …, φTN}, are stored and sent to storage unit 102.
Storage unit 102 orthogonally transforms each pattern in the training pattern set as the retrieval object in advance by using the eigenvectors acquired by orthogonal transformation unit 107, and then stores the result as the training pattern set.
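For illustration, the principal component analysis and the orthogonal transformation may be sketched as follows; the use of numpy's eigendecomposition routines is an assumption of the sketch.

    import numpy as np

    def orthogonal_transform(patterns, N):
        # Principal component analysis of the training pattern set.
        centered = patterns - patterns.mean(axis=0)
        eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
        order = np.argsort(eigvals)[::-1][:N]  # top-N axes by descending eigenvalue
        O = eigvecs[:, order].T                # N x D orthogonal transformation matrix
        # The rows of O are orthonormal, so Euclidean distances are preserved
        # for the retained components of the projected patterns.
        return O, patterns @ O.T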
Then, the orthogonally transformed training pattern set is sent to distribution acquisition unit 103 and training unit 105.
Distribution acquisition unit 103 obtains a cumulative probability distribution on each axis of the training pattern set stored by storage unit 102.
Here, the cumulative probability distribution is treated as a general distribution; the general distribution is assumed to be a mixture of Gaussian distributions and is estimated by a weighted addition of sigmoid functions. Actually, with respect to n training patterns selected at random, the bandwidth hi of the sigmoid function centered on each of the training patterns is optimized, and the cumulative probability distribution is made a function of following expression (7) by the least squares approximation.

Psd(x) = (1/n) Σi=1,…,n 1/(1 + exp(−(x − xi)/hi)) (7)
Here, h′ denotes the mean value of the hi. The number n is given as a constant, or the minimum number within the allowable range of approximation accuracy is obtained by a general optimization method.
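A minimal sketch of this estimation follows; the equal weighting of the components and the use of the Nelder-Mead method are assumptions made for the sketch, not a limitation of the embodiment.

    import numpy as np
    from scipy.optimize import minimize

    def fit_sigmoid_mixture(values, n, seed=0):
        # n sigmoid components, each centered on a randomly selected training value.
        rng = np.random.default_rng(seed)
        centers = rng.choice(values, size=n, replace=False)
        xs = np.sort(values)
        cdf = np.arange(1, len(xs) + 1) / len(xs)

        def mixture_cdf(x, h):
            h = np.abs(h)  # keep the bandwidths h_i positive
            return np.mean(1.0 / (1.0 + np.exp(-(x[:, None] - centers) / h)), axis=1)

        def loss(h):
            # Least squares fit against the empirical cumulative distribution.
            return np.sum((mixture_cdf(xs, h) - cdf) ** 2)

        h0 = np.full(n, np.std(values))  # common initial bandwidth
        return centers, np.abs(minimize(loss, h0, method="Nelder-Mead").x)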
First, an unknown input pattern q, such as a face image, to be retrieved is inputted to search unit 106. A search is made for the nearest neighbor pattern in the training pattern set ALL with respect to the input pattern q. This search is made based on each hash function acquired by hash function unit 104 and the subset generated by training unit 105.
The input pattern is orthogonally transformed by the eigenvectors acquired by orthogonal transformation unit 107 and is made an orthogonally transformed pattern Oq. When this orthogonally transformed pattern is made the input value, a subset BiH limited by the hash value H as the output of the i-th hash function can be defined by following expression (8).
BiH = {x | x ∈ ALL, hi(x) = H} (8)
Here, the union of the subsets obtained by m hash functions selected at random is made the nearest neighbor candidate set C(q) of following expression (9).

C(q) = B1h1(Oq) ∪ B2h2(Oq) ∪ … ∪ Bmhm(Oq) (9)
At this time, when the nearest neighbor candidate set C(q) is acquired, the number of buckets containing each candidate x is made its multiplicity w(x), the candidates are sorted by this multiplicity, and the distance between the candidate of maximum multiplicity and the input pattern is made a provisional distance z. When the Lp distance between x1 and x2 is defined as dp(x1, x2), the provisional distance z is expressed by following expression (10).

z = dp(Oq, x*), where x* = argmax{w(x) | x ∈ C(q)} (10)
Here, since a candidate x satisfying following expression (11) cannot become the nearest neighbor, the candidate can be removed from the nearest neighbor candidate set C(q).

|φTiq − φTix|p > zp (11)
Based on this property, a search refinement is performed by the following procedure, in sequence from the projection component on the principal axis with the highest eigenvalue.
At step 1, i is made 1, and the initial candidate set C0(q) is made C(q).
At step 2, with respect to the respective elements of Ci−1(q), the candidates satisfying the condition of expression (11) are removed, and the obtained candidate set is made Ci(q). When i = m, advance is made to step 4. When |Ci(q)| = 1, the remaining element is made the nearest neighbor pattern and the procedure stops.
At step 3, i is made i + 1, and return is made to step 2.
At step 4, distance calculation is performed between the input pattern and all of the candidates in Ci(q), and the training pattern giving the minimum distance is made the nearest neighbor pattern and the procedure stops.
The reason why such calculation is performed is that the first principal component, having the highest eigenvalue, is the axis with the largest dispersion, and almost all elements of C(q) − B1h1(q) satisfy the condition |φT1q − φT1x|p > zp; therefore, the nearest neighbor candidate set can be reduced at one stroke merely by checking the value of φT1q. Since the same applies to each of the second and subsequent principal axes when compared with the axes following it, the process can be said to be appropriate in view of the efficiency of refining the candidates.
The finally acquired nearest neighbor pattern is outputted to the outside of pattern search apparatus 101.
(Third Embodiment)
A pattern recognition apparatus 201 of a third embodiment of the invention will be described.
Pattern recognition apparatus 201 of this embodiment is different from pattern search apparatus 1 of the first embodiment and pattern search apparatus 101 of the second embodiment in the following points.
The first different point is that pattern recognition apparatus 201 has previously stored, as a training pattern set, the patterns which become retrieval objects, together with the class corresponding to each of them; the nearest neighbor pattern of an input pattern is searched by a pattern search unit 208, and the corresponding class outputted by a class unit 209 is outputted as a recognition result.
The second different point is that, in pattern search unit 208, a distribution acquisition unit 203 acquires a cumulative histogram based on the values of the patterns on each dimensional axis and makes it the cumulative probability distribution.
Pattern recognition apparatus 201 includes a storage unit 202, distribution acquisition unit 203, a hash function unit 204, a training unit 205, a search unit 206, an orthogonal transformation unit 207, pattern search unit 208, and class unit 209.
Incidentally, in the operation of pattern recognition apparatus 201, a description of the processes similar to those of pattern search apparatus 1 of the first embodiment and pattern search apparatus 101 of the second embodiment will be omitted.
Distribution acquisition unit 203 acquires a cumulative probability distribution on each axis of the training pattern set stored in storage unit 202. Here, the training patterns are rearranged by their values on each axis to obtain a cumulative histogram, and the cumulative probability distribution is estimated from it.
A difference from the distribution expressions using functions in the above embodiments is that the cumulative histogram can be obtained uniquely since there is no optimization parameter; however, when the number of training patterns is small, an error becomes large since the distribution between training patterns is represented linearly.
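A minimal sketch of the bucket assignment from such a cumulative histogram follows; the linear interpolation between training values reflects the property noted above, and the function names are assumptions of the sketch.

    import numpy as np

    def empirical_cdf_hash(values, n):
        # Cumulative histogram of the training values on one dimensional axis.
        xs = np.sort(values)
        cdf = np.arange(1, len(xs) + 1) / len(xs)

        def hash_value(x):
            # Linearly interpolated empirical cumulative probability of x; no
            # parameter to optimize, but coarse for few training patterns.
            p = np.interp(x, xs, cdf)
            return min(int(p * n), n - 1)  # expression (2) with Ps from the histogram

        return hash_value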
Pattern search unit 208 includes storage unit 202, distribution acquisition unit 203, hash function unit 204, training unit 205, search unit 206, and orthogonal transformation unit 207, and searches the nearest neighbor pattern of the input pattern from the training pattern set stored in storage unit 202.
Class unit 209 registers the class corresponding to each pattern in the training pattern set, and outputs, as the class of the input pattern, the class corresponding to the nearest neighbor pattern searched by pattern search unit 208 to the outside of pattern recognition apparatus 201.
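This class output amounts to a table lookup on the search result; a minimal sketch, assuming a searcher function such as the search_nearest sketch above, is as follows.

    def recognize(q, classes, searcher):
        # classes[i] is the class registered for the i-th training pattern;
        # searcher returns the index of the nearest neighbor pattern of q.
        idx = searcher(q)
        return None if idx is None else classes[idx]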
Incidentally, the present invention is not limited to the embodiments described above, and the components can be modified and embodied in an implementation phase within a scope not departing from the gist of the invention.
Besides, various inventions can be formed by appropriate combinations of components disclosed in the embodiments. For example, some components may be deleted from all components disclosed in the embodiment. Further, components of different embodiments may be appropriately combined.
Other Publications

Indyk, P. and Motwani, R., "Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality," In Proceedings of the 30th ACM Symposium on Theory of Computing (STOC'98), pp. 604-613, May 1998.
Datar, M., Indyk, P., Immorlica, N., and Mirrokni, V., "Locality-Sensitive Hashing Scheme Based on p-Stable Distributions," In Proceedings of the 20th Annual Symposium on Computational Geometry (SCG2004), June 2004.
Japanese Office Action for Japanese Application No. 2007-183788, mailed May 8, 2012.
Ishibashi et al., "Hierarchical Clustering Algorithm Using Locality-Sensitive Hashing."