Automated object detection and/or recognition (ODR) is a core facility enabling sophisticated treatment of raw data. Applications are varied, but illustrative examples include detection of physical objects (from simple geometric shapes through to geographic features, vehicles and faces) in raw static images or video, as well as detection of audio objects such as songs or voices in raw audio data. In some cases, detection (i.e., detection and/or recognition) is practically the whole application; in others, it is a small part of a much larger application.
A myriad of techniques have been developed for ODR, each with its advantages and disadvantages. However, a constant theme over time has been a demand for better efficiency as raw data sets grow ever larger. For example, it is desirable to recognize aspects of non-text media available in large public computer networks to facilitate non-text media search functionality, but it is not uncommon for corresponding raw data sets to contain items numbering in the billions. At such scales even small improvements in detection speed and accuracy can have large efficiency impacts, and it is desirable to know when a particular technique has been optimally configured in some respect.
An efficient, effective and at times superior object detection and/or recognition (ODR) function may be built from a set of Bayesian stumps. Bayesian stumps may be constructed for each feature and object class, and the ODR function may be constructed from the subset of Bayesian stumps that minimize Bayesian error for a particular object class. That is, Bayesian error may be utilized as a feature selection measure for the ODR function. Furthermore, Bayesian stumps may be efficiently implemented as lookup tables with entries corresponding to unequal intervals of feature histograms. Interval widths and entry values may be determined so as to minimize Bayesian error, yielding Bayesian stumps that are optimal in this respect.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
The same numbers are used throughout the disclosure and figures to reference like components and features.
In an embodiment of the invention, an efficient object detection and/or recognition (ODR) function is built from a set of Bayesian stumps. A Bayesian stump may be implemented with a lookup table (LUT) having entries corresponding to intervals of a feature histogram. Each Bayesian stump may have a specified (and possibly same) number of entries. The entries may correspond to feature intervals having unequal width. The width of the intervals may be automatically determined. The width of the intervals and the value of corresponding lookup table entries may be set so as to minimize Bayesian error. That is, the Bayesian stump may be constructed so as to be optimal with respect to Bayesian error for a given feature and lookup table.
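By way of non-limiting illustration, a lookup table over unequal-width feature intervals might be sketched as follows. The class name `BayesianStump`, the `boundaries` and `entries` parameters, and the example cut points are hypothetical and are chosen only for this sketch; the half-open interval convention is likewise an assumption.

```python
from bisect import bisect_right


class BayesianStump:
    """Illustrative lookup-table classifier over unequal-width feature intervals."""

    def __init__(self, boundaries, entries):
        # K table entries are delimited by K - 1 sorted interior cut points.
        if len(entries) != len(boundaries) + 1:
            raise ValueError("need exactly one more entry than cut points")
        if list(boundaries) != sorted(boundaries):
            raise ValueError("cut points must be sorted in ascending order")
        self.boundaries = list(boundaries)
        self.entries = list(entries)

    def classify(self, feature_value):
        # bisect_right yields the index of the interval containing the value;
        # a value equal to a cut point falls into the interval to its right.
        return self.entries[bisect_right(self.boundaries, feature_value)]


# Five entries over intervals of unequal width (hypothetical values).
stump = BayesianStump(boundaries=[-0.5, 0.1, 0.15, 2.0],
                      entries=["w2", "w1", "w2", "w1", "w2"])
label = stump.classify(0.12)
```

In this sketch a classification costs one binary search over the cut points followed by one table read, regardless of how narrow the intervals are in regions of high information density.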
Such Bayesian stumps may be constructed for each feature and object class. Each such Bayesian stump may be considered, by itself, a weak classifier. An adaptive boosting (“Adaboost”) technique may be utilized to construct a strong classifier, that is, an effective ODR function, based on Bayesian stumps. Such boosting techniques require a way of selecting between weak classifiers and, in an embodiment of the invention, those Bayesian stumps are selected which minimize Bayesian error. That is, Bayesian error may serve as a feature selection measure for the strong classifier, and is at times superior to conventional measures such as Bhattacharyya distance or Jensen-Shannon entropy in terms of both speed and accuracy. As used herein, the term “Bayesian boosting” includes construction of Bayesian stumps as well as construction of ODR functions based on Bayesian stumps in accordance with an embodiment of the invention.
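As a hedged sketch of Bayesian error serving as a feature selection measure, the example below assumes a two-class problem in which each feature is summarized by a pair of joint histograms p(l, w1) and p(l, w2) over the same intervals. Under that assumption, the Bayesian error of a stump that assigns each interval to its more probable class is the sum of the smaller joint probability in each interval. The function names and histogram values are hypothetical.

```python
def bayesian_error(hist_w1, hist_w2):
    """Sum, over intervals, of the smaller of the two joint probabilities."""
    if len(hist_w1) != len(hist_w2):
        raise ValueError("histograms must cover the same intervals")
    return sum(min(p1, p2) for p1, p2 in zip(hist_w1, hist_w2))


def select_feature(histograms):
    """Return the feature name whose histograms give the lowest Bayesian error.

    histograms maps a feature name to a (hist_w1, hist_w2) pair.
    """
    return min(histograms, key=lambda name: bayesian_error(*histograms[name]))


# Hypothetical joint histograms; each pair sums to 1 across both classes.
candidates = {
    "feature_a": ([0.30, 0.15, 0.05], [0.05, 0.15, 0.30]),
    "feature_b": ([0.45, 0.05, 0.00], [0.00, 0.05, 0.45]),
}
best = select_feature(candidates)
```

Here `feature_b` separates the two classes more cleanly than `feature_a`, so it yields the smaller error and is the one selected.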
Before describing further details of object detection in accordance with an embodiment of the invention, it will be helpful to have reference to an example computing environment suitable for incorporating such.
The computer 102 may be any suitable computing device. Examples of suitable computing devices include mainframes, minicomputers, desktop computers, personal computers (PCs), workstations, portable computers, laptop computers, tablet computers, personal digital assistants (PDAs), mobile telephones, programmable consumer electronics devices, routers, gateways, switches, hubs, and suitable combinations thereof.
The computer 102 may include one or more processing units capable of executing instructions to perform tasks, as well as one or more types of computer-readable media such as volatile and/or non-volatile memory capable of storing data, computer programs and/or computer program components. Such computer programs and components may include executable instructions, structured data and/or unstructured data organized into modules, routines and/or any suitable programmatic object. Such computer programs and components may be created by and/or incorporate any suitable computer programming language.
The computer 102 may include a wide variety of input/output (I/O) devices not shown in
For clarity, embodiments of the invention may be described herein with reference to symbolic operations such as those of a computer programming language. Such symbolic operations and any data that they act upon correspond to physical states of components and changes in components of computing devices such as the computer 102 in a manner well understood by one of skill in the art. In an embodiment of the invention, each such operation and its associated data may be fully implemented in hardware.
The application 106 may be any suitable application making use of ODR functionality. Although
As will be apparent to one of skill in the art, the Bayesian boosting ODR module 110 need not be entirely incorporated into an application such as the application 106. For example, Bayesian boosting ODR module 110 functionality may be made available to the application 106 utilizing any suitable distributed functionality technique. Similarly, the training data 112 need not be located entirely at the computer 102, but may be suitably distributed, for example, in a network of computers such as the computer 102.
The training data 112 may include a set S of classified examples {(xi, yi)}, i=1, . . . , N, where xi is a unit (e.g., a file or record) of raw data containing a positive or negative example (e.g., presence or absence) of a particular object and/or class of objects, yi is an object classification label corresponding to a classification of the example xi, and N is any suitable number of examples. The object classification labels yi are taken from a set of object classification labels {wc}, c=1, . . . , C, where again, C may be any suitable number. As an illustrative example that will be elaborated below, the raw data xi may be static images, some of which include human faces, the set of object classification labels may be {w1, w2} where w1 corresponds to a classification of “contains a face,” and w2 corresponds to a classification of “does not contain a face,” and the set S is a set of appropriately labeled static images.
The training data 112 may be utilized by the Bayesian boosting ODR module 110 to build Bayesian stumps and an ODR function capable of detecting object classes {wc}.
The Bayesian boosting ODR module 202 may further include a set of object feature extractors 206. The set of object feature extractors 206 may be a set of functions {φj}, j=1, . . . , M, where M is any suitable number, capable of extracting some feature of the raw data, for example, capable of evaluating a feature score for a particular item of raw data. Such feature extractors φj may be linear (e.g., wavelet transforms) or non-linear (e.g., local binary pattern filters). By using the object feature extractors 206 on the training data 112 (
A Bayesian stump generator 212 may construct the feature space histograms 208 from the training data 112 (
From the Bayesian stumps 214 and an adaptive set of classifier weights 216, an ODR function generator 218 may generate one or more ODR functions 220 capable of detecting objects belonging to one or more of the object classes 204 in raw data. For example, a particular one of the ODR functions 220 may be able to detect faces in static images.
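The feature space histograms mentioned above might, for example, be accumulated as in the following sketch. It assumes that feature scores φj(xi) have already been computed, that there are two classes labeled "w1" and "w2", and that each example carries a non-negative weight; the function name, the clamping of out-of-range scores, and the sample values are assumptions made only for illustration.

```python
def joint_histograms(scores, labels, weights, low, high, num_bins):
    """Equal-width, weighted joint histograms p(bin, class) for two classes."""
    if num_bins < 1 or high <= low:
        raise ValueError("need at least one bin and a non-empty range")
    hists = {"w1": [0.0] * num_bins, "w2": [0.0] * num_bins}
    width = (high - low) / num_bins
    total = sum(weights)
    for score, label, weight in zip(scores, labels, weights):
        index = int((score - low) / width)
        # Assumption: scores outside [low, high) are clamped to the end bins.
        index = min(max(index, 0), num_bins - 1)
        hists[label][index] += weight / total
    return hists


# Four hypothetical examples with uniform weights and two equal-width bins.
hists = joint_histograms(scores=[0.1, 0.2, 0.6, 0.9],
                         labels=["w1", "w1", "w2", "w2"],
                         weights=[1.0, 1.0, 1.0, 1.0],
                         low=0.0, high=1.0, num_bins=2)
```

Dividing each contribution by the total weight makes the two histograms sum to one across both classes, so each bin value can be read as a joint probability.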
Having described structural aspects of the Bayesian boosting ODR module 110 (
At step 302, one or more Bayesian stumps may be instantiated. For example, the Bayesian stump generator 212 (
At step 308, a candidate data set may be acquired. For example, the computer 102 (
The Bayesian stump generator 212 (
At step 402, equal-interval feature space histograms may be instantiated. For example, the Bayesian stump generator 212 (
If the information density of the feature space is non-uniform (as is not infrequently the case) then quantizing the feature space with intervals of equal width is inefficient since a high number of intervals may be required to adequately capture details in regions of high information density and then the overhead to support the additional intervals in regions of low information density will be wasted. Instead intervals of unequal width may be utilized, but then a method of determining suitable interval widths is required. In the example depicted in
Adjacent intervals may be examined for consistency. In an embodiment of the invention, adjacent intervals are considered consistent if an arithmetic difference between two feature histograms 208 (
(f(xr) − g(xr)) · (f(xs) − g(xs)) ≥ 0
for any xr, xs in the region σ, where, for example, f(x)=p(φj(x), w1) and g(x)=p(φj(x), w2). At step 404, adjacent consistent intervals in a set of feature intervals 210 (
Following one or more iterations of step 404, the number of intervals in the set may be decreased, for example, from the initial number of intervals L to a current number of intervals L′. At step 408, the current number of intervals L′ may be compared to the target number of intervals K. If the current number of intervals L′ is still greater than the target number of intervals K, the procedure may progress to step 410. If the current number of intervals L′ is now less than the target number of intervals K, the procedure may progress to step 412. If the current number of intervals L′ is now equal to the target number of intervals K, the procedure may progress to step 502 of
At step 410, the set of L′ feature intervals 210 (
l* = arg min_l |p(l, w1) − p(l, w2)|
where l is any of the set of L′ feature intervals 210 (
In contrast, at step 412, the set of L′ feature intervals 210 (
l* = arg max_l [min(p(l, w1), p(l, w2))]
where, again, l is any of the set of L′ feature intervals 210 (
and k=1, . . . , T. Symbolically:
k* = arg max_k I(k)
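The interval selection criteria above can be illustrated with the following sketch, which is offered under stated assumptions and not as a definitive implementation. Each interval is represented as a hypothetical pair (p(l, w1), p(l, w2)) of joint probabilities. The sketch covers the consistency test, the merging of adjacent consistent intervals, and the two arg min / arg max selections. It does not show how a selected interval is merged with a neighbor, nor how a split point is chosen from I(k).

```python
def is_consistent(a, b):
    """True when p(l, w1) - p(l, w2) has the same sign (or is zero) in both."""
    return (a[0] - a[1]) * (b[0] - b[1]) >= 0


def merge_consistent(intervals):
    """Merge each run of adjacent consistent intervals into one interval."""
    merged = [intervals[0]]
    for current in intervals[1:]:
        last = merged[-1]
        if is_consistent(last, current):
            # Merging adds the probability mass of the two intervals.
            merged[-1] = (last[0] + current[0], last[1] + current[1])
        else:
            merged.append(current)
    return merged


def least_discriminative(intervals):
    """Index l* = arg min_l |p(l, w1) - p(l, w2)| (merge candidate)."""
    return min(range(len(intervals)),
               key=lambda l: abs(intervals[l][0] - intervals[l][1]))


def most_error_prone(intervals):
    """Index l* = arg max_l min(p(l, w1), p(l, w2)) (split candidate)."""
    return max(range(len(intervals)), key=lambda l: min(intervals[l]))


# Six hypothetical equal-width intervals as (p(l, w1), p(l, w2)) pairs.
bins = [(0.10, 0.02), (0.15, 0.03), (0.05, 0.10),
        (0.04, 0.03), (0.02, 0.21), (0.05, 0.20)]
merged = merge_consistent(bins)
merge_candidate = least_discriminative(merged)
split_candidate = most_error_prone(merged)
```

With these values, the first two and the last two intervals merge, leaving four. The third of the four has the smallest gap between the classes and would be the merge candidate, while the last has the largest overlap and would be the split candidate.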
Once the desired number of intervals K is achieved, at step 502, a lookup table having K entries may be instantiated for the corresponding Bayesian stump. For example, the Bayesian stump generator 212 (
At step 504, a next interval k of the K intervals may be selected (or else the first of the K intervals if none has previously been selected). At step 506, the corresponding entry of the lookup table instantiated at step 502 may be set so as to minimize Bayesian error. For example, the entry of the lookup table may be set to w1 if that choice minimizes Bayesian error (i.e., if P(w1|x)>P(w2|x)), or otherwise to w2 (if P(w2|x)>P(w1|x)). The lookup table entry may be set to the object class label (e.g., the value of w1 or w2). Alternatively, the lookup table entry may be set to the corresponding log-likelihood of the classification. Symbolically:
At step 508, it may be determined if there are more intervals of the K intervals to be selected. If there are more intervals, the procedure may return to step 504 to select the next interval k. Otherwise, the procedure may progress to steps other than those depicted in
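The loop over intervals described in steps 504 through 508 might be sketched as follows. Each interval is assumed to be summarized by a hypothetical pair (p(k, w1), p(k, w2)) of joint probabilities; because the posterior is proportional to the joint probability within an interval, comparing the pair is equivalent to comparing P(w1|x) and P(w2|x). The real-valued variant uses a log ratio of the two probabilities, which is only one possible reading of "log-likelihood of the classification" and is an assumption of this sketch, as are the smoothing constant `eps` and the tie-breaking in favor of w1.

```python
import math


def fill_lookup_table(intervals, use_log_ratio=False, eps=1e-9):
    """Set one lookup table entry per interval so as to minimize Bayesian error."""
    table = []
    for p_w1, p_w2 in intervals:
        if use_log_ratio:
            # Assumption: a signed score, positive when w1 is more probable.
            # eps avoids taking the logarithm of zero for empty intervals.
            table.append(math.log((p_w1 + eps) / (p_w2 + eps)))
        else:
            # Choose the class with the larger probability; ties go to w1.
            table.append("w1" if p_w1 >= p_w2 else "w2")
    return table


# Three hypothetical intervals as (p(k, w1), p(k, w2)) pairs.
intervals = [(0.25, 0.05), (0.05, 0.10), (0.11, 0.44)]
labels = fill_lookup_table(intervals)
scores = fill_lookup_table(intervals, use_log_ratio=True)
```

The label form returns a hard decision per interval, whereas the score form keeps a measure of confidence that a boosting procedure could sum across stumps.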
An effective ODR function may be constructed from such Bayesian stumps. For example, the ODR function may be constructed from T such Bayesian stumps.
At step 608, a next feature j (i.e., a next feature extractor φj) may be selected (or else a first such feature if none has been previously selected). At step 610, a Bayesian stump may be trained for the feature j and the classifier weights {w_i^t}. For example, the Bayesian stump generator 212 may instantiate a new one of the Bayesian stumps 214 for the feature j, where the importance of each example in the feature space x′i=φj(xi) may be modified based on the classifier weight w_i^t (e.g., x′i = w_i^t φj(xi)). At step 612, it may be determined if there are more features to be selected. If there are more features to be selected, a procedure incorporating steps depicted in
Steps 608, 610 and 612 may result in the instantiation of M Bayesian stumps, that is, one for each object feature extractor 206 (
At step 704, classifier weights for the next boosting iteration (i.e., iteration t+1) may be updated based on the Bayesian stump selected at step 702. For example, classifier weights w_i^(t+1) may be updated according to the formula: w_i^(t+1) = w_i^t exp(−h_t(x_i)) for each i=1, . . . , N. At step 706, it may be determined if there are more boosting iterations to complete (i.e., if t<T). If there are more boosting iterations to complete, the procedure may return to step 604. Otherwise, the procedure may progress to step 708.
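The weight update of step 704 can be illustrated as below. The sketch mirrors the formula as written, multiplying each weight by exp(−h_t(x_i)), and assumes that the selected stump returns a real-valued score for each example. How the sign of that score relates to the example's label, and whether the weights are renormalized afterwards, is not specified here and is left out. The function name and sample values are hypothetical.

```python
import math


def update_weights(weights, stump_outputs):
    """Return next-iteration weights: each weight times exp(-h_t(x_i))."""
    if len(weights) != len(stump_outputs):
        raise ValueError("need one stump output per example")
    return [w * math.exp(-h) for w, h in zip(weights, stump_outputs)]


# Four hypothetical examples with uniform weights at iteration t.
weights_t = [0.25, 0.25, 0.25, 0.25]
outputs_t = [1.0, 0.0, -1.0, 0.0]   # assumed real-valued stump scores
weights_next = update_weights(weights_t, outputs_t)
```

Examples with a positive stump output lose weight, those with a negative output gain weight, and an output of zero leaves the weight unchanged.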
At step 708, an ODR function 220 (
where H(x) is an ODR function 220 (
All references, including publications, patent applications, and patents, cited herein are hereby incorporated by reference to the same extent as if each reference were individually and specifically indicated to be incorporated by reference and/or were set forth in its entirety herein.
The use of the terms “a” and “an” and “the” and similar referents in the specification and in the following claims is to be construed to cover both the singular and the plural, unless otherwise indicated herein or clearly contradicted by context. The terms “having,” “including,” “containing” and similar referents in the specification and in the following claims are to be construed as open-ended terms (e.g., meaning “including, but not limited to,”) unless otherwise noted. Recitation of ranges of values herein is merely intended to serve as a shorthand method of referring individually to each separate value inclusively falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein. All methods described herein can be performed in any suitable order unless otherwise indicated herein or clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to better illuminate embodiments of the invention and does not pose a limitation to the scope of the invention unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to an embodiment of the invention.
Preferred embodiments of the invention are described herein, including the best mode known to the inventors for carrying out the invention. Variations of those preferred embodiments may become apparent to those of ordinary skill in the art upon reading the specification. The inventors expect skilled artisans to employ such variations as appropriate, and the inventors intend for the invention to be practiced otherwise than as explicitly described herein. Accordingly, embodiments of the invention include all modifications and equivalents of the subject matter recited in the following claims as permitted by applicable law.