There are many techniques for classifying objects into one of several known object classes. For example, the objects may be moving objects or static objects. Typically, these techniques are parametric and may need large amounts of training data or samples. Parametric techniques include those based on hidden Markov models (HMM), support vector machines (SVM) and artificial neural networks (ANN). Non-parametric methods such as nearest neighbor also exist, but they may not be accurate with small amounts of training data. Thus, because they require a large number of training samples, the above-mentioned techniques for classification of objects may not be feasible. Further, authoring a new object class may also be cumbersome, as it usually involves re-training on the entire data.
Various embodiments are described herein with reference to the drawings, wherein:
The drawings described herein are for illustration purposes only and are not intended to limit the scope of the present subject matter in any way.
A system and method for classification of moving objects and user authoring of new object classes is disclosed. In the following detailed description of the embodiments of the present subject matter, reference is made to the accompanying drawings that form a part hereof, and in which are shown by way of illustration specific embodiments in which the present subject matter may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the present subject matter, and it is to be understood that other embodiments may be utilized and that changes may be made without departing from the scope of the present subject matter. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present subject matter is defined by the appended claims.
In the document, ‘moving object’ refers to a general entity that includes motions of different entities like a continued motion of the left hand followed by a motion of the right hand. A collection of such ‘moving objects’ into which a given test object needs to be classified is referred to as an ‘object class’ in the document. The object class includes variations of the ‘moving objects’.
At step 106, multiple initial candidate library object descriptors are identified from an object library and a motion library using the extracted object descriptor and the extracted motion descriptor. The object library and motion library are formed from given object samples including known object classes. The formation of the object library and the motion library is explained in greater detail in the below description. At step 108, an initial object class estimate is identified based on the identified multiple initial candidate library object descriptors. At step 110, an initial residue is computed based on the extracted object descriptor and the identified multiple initial candidate library object descriptors associated with the initial object class estimate.
At step 112, a set of multiple candidate library object descriptors is identified from the object library based on a residue and the multiple candidate library object descriptors identified in a previous iteration. At step 114, a score is computed for each object class based on the identified set of multiple candidate library object descriptors. At step 116, the object class estimate with the highest score is identified. At step 118, a residue is computed based on the extracted object descriptor and the identified candidate library object descriptors associated with the identified object class estimate. At step 120, it is determined whether the identified object class estimates converge based on a stopping criterion. If they converge, step 122 is performed; otherwise, the method returns to step 112.
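The iteration of steps 112 to 120 may be illustrated by the following minimal sketch. It is not the claimed method itself: it assumes that candidates are selected by correlating the object library with the residue, that each object class is scored by the norm of the projection of the extracted object descriptor onto that class's candidate columns, and that the stopping criterion is an unchanged class estimate. The motion library is omitted for brevity, and the names `classify`, `n_select` and `max_iter` are hypothetical.

```python
import numpy as np

def classify(lib, labels, x, n_select=2, max_iter=10):
    """Iteratively estimate the object class of descriptor x.

    lib    : (F, n) object library, one library column per column
    labels : (n,) object class of each library column
    x      : (F, K) extracted object descriptor
    Returns (class estimate, norm of the final residue).
    """
    labels = np.asarray(labels)
    residue = x
    selected = np.array([], dtype=int)
    prev_class = None
    best = None
    for _ in range(max_iter):
        # Step 112 (assumed form): pick the library columns most
        # correlated with the current residue, plus earlier candidates.
        corr = np.linalg.norm(lib.T @ residue, axis=1)
        corr[selected] = 0.0
        new = np.argsort(corr)[-n_select:]
        candidates = np.union1d(selected, new)

        # Step 114 (assumed score): how much of x each class explains
        # using only its own candidate columns.
        scores = {}
        for c in np.unique(labels[candidates]):
            sub = lib[:, candidates[labels[candidates] == c]]
            scores[c] = np.linalg.norm(sub @ (np.linalg.pinv(sub) @ x))

        # Step 116: class estimate with the highest score.
        best = max(scores, key=scores.get)

        # Step 118: residue with respect to the winning class's candidates.
        selected = candidates[labels[candidates] == best]
        sub = lib[:, selected]
        residue = x - sub @ (np.linalg.pinv(sub) @ x)

        # Step 120 (assumed stopping criterion): estimate unchanged.
        if prev_class is not None and best == prev_class:
            break
        prev_class = best
    return best, np.linalg.norm(residue)
```

The returned residue norm could, under a further assumption, feed an object rejection criterion by comparing it with a threshold.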
At step 122, the identified object class is declared as an output object class. In one example implementation, if it is determined in step 120 that the identified object class estimates converge based on the stopping criterion, it is determined whether to reject the inputted moving object based on an object rejection criterion. Further, if the inputted object is not to be rejected, step 122 is performed. According to one embodiment of the present subject matter, a method of classification of a static object may also be realized in a similar manner as the method described above. One example of classification of static objects is recognition of logos from printed documents, which is explained in detail with respect to
The object library and the motion library may be formed as below. Consider a set of N object classes labeled 1, 2, 3 . . . N. Each of the object classes includes a small set of representative samples. For example, the samples may be a set of short videos of the moving object. Within each sample, a relevant portion that includes the moving object is first identified. This may be done, for example in videos, by identifying a start frame and an end frame using any suitable object detection and segmentation technique. The identification of the start frame and the end frame removes extraneous data not needed for classification.
Then, an object class library Li is formed for each object class i. The object class library Li includes two sub-libraries, namely object library Lo,i and motion library Lm,i. The object library Lo,i and motion library Lm,i include object descriptors and motion descriptors, respectively. The object library Lo,i for a given object class i is formed by extracting suitable object descriptors from given samples of the object class i. For example, an object descriptor is extracted from each sample of the object class i and then the object descriptors are concatenated to form the object library Lo,i.
For example, if the given samples of the object class i are short videos, a few frames are selected from the given video samples, and object feature vectors are computed for the selected frames. The frame selection may be performed by sampling so as to capture enough representative object feature vectors. For example, the object feature vectors may be features describing shape, size, color, temperature, motion, intensity of the object, and the like. The object descriptor is then formed by concatenating the object feature vectors columnwise.
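The formation of an object descriptor from a video sample may be sketched as follows. The feature extractor `frame_features` is purely hypothetical (mean intensity, intensity spread, and fraction of pixels brighter than the mean) and stands in for whatever shape, size, color or intensity features an embodiment uses; uniform frame sampling is likewise an assumption.

```python
import numpy as np

def frame_features(frame):
    """Hypothetical object feature vector for one frame (F = 3)."""
    frame = np.asarray(frame, dtype=float)
    mean = frame.mean()
    return np.array([
        mean,                    # average intensity
        frame.std(),             # intensity spread
        (frame > mean).mean(),   # fraction of pixels above the mean
    ])

def object_descriptor(frames, n_frames=4):
    """Sample frames uniformly and stack their feature vectors columnwise.

    Returns an (F, K) array with K = min(n_frames, len(frames)).
    """
    k = min(n_frames, len(frames))
    idx = np.linspace(0, len(frames) - 1, num=k).round().astype(int)
    return np.column_stack([frame_features(frames[i]) for i in idx])
```

Each column of the returned array is one object feature vector, so the number of rows equals the feature dimension.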
The above process is then repeated for each video sample and the object descriptors from each of the video samples are concatenated to form the object library Lo,i for a given object class i. Mathematically, the object library Lo,i is represented as Lo,i=[Lo,i,1 Lo,i,2 Lo,i,3 . . . Lo,i,Mi], where Mi denotes the number of samples of the object class i.
The full object library Lo for the N object classes is obtained by further concatenating the individual object libraries. Thus, Lo=[Lo,1 Lo,2 Lo,3 . . . Lo,N], where Lo,i denotes the object library for object class i, which is formed as explained above. The number of rows in Lo is F, while the number of columns depends on the total number of samples. Thus, Lo is composed of M1+M2+ . . . +MN object descriptors.
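The concatenation described above may be sketched as follows, assuming each object descriptor is an array with F rows. The function name `build_library` and the per-column label array are illustrative additions, not part of the described embodiments; the labels simply record which object class each library column came from.

```python
import numpy as np

def build_library(descriptors_by_class):
    """Concatenate per-sample descriptors into a full library.

    descriptors_by_class : dict mapping class id -> list of (F, K) arrays
    Returns (library, labels), where library has F rows and labels[j]
    is the object class of library column j.
    """
    blocks, labels = [], []
    for cls, descriptors in descriptors_by_class.items():
        class_lib = np.hstack(descriptors)   # per-class library Lo,i
        blocks.append(class_lib)
        labels.extend([cls] * class_lib.shape[1])
    return np.hstack(blocks), np.array(labels)
```

The same routine could be applied to motion descriptors to obtain a motion library, provided they share a common number of rows.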
Similarly, the motion library is Lm=[Lm,1 Lm,2 Lm,3 . . . Lm,N], where Lm,i denotes the motion library for object class i. For each object sample, a motion descriptor may be formed for that sample. Then the motion descriptors from each of the object samples may be stacked to form the motion library Lm,i. Thus, Lm,i can be written as Lm,i=[lm,i,1 lm,i,2 . . . lm,i,Mi]
At step 206, it is determined whether to reject the authored object class. For example, an object rejection criterion may be used to determine whether the object library and the motion library associated with the authored object class are substantially close to the existing object library and motion library. If so, the authored object class is rejected and the user is asked to provide an alternate object class in step 208. If not, step 210 is performed, where the object library and the motion library associated with the authored object class are added to the existing object library and motion library, respectively.
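Steps 206 to 210 may be sketched as follows for the object library; the motion library could be handled in the same way. The object rejection criterion is not specified here, so the sketch assumes one: the authored library is treated as substantially close when almost all of its energy lies in the column space of the existing library. The names `is_too_close` and `author_class` and the threshold value are hypothetical.

```python
import numpy as np

def is_too_close(existing, new, threshold=0.1):
    """Assumed rejection criterion: relative projection residual.

    existing : (F, n) existing library
    new      : (F, m) library of the authored object class
    """
    projection = existing @ (np.linalg.pinv(existing) @ new)
    relative = np.linalg.norm(new - projection) / np.linalg.norm(new)
    return relative < threshold

def author_class(existing, labels, new, new_label, threshold=0.1):
    """Reject the authored class (step 208) or append it (step 210).

    Returns (library, labels, accepted).
    """
    if is_too_close(existing, new, threshold):
        return existing, labels, False
    library = np.hstack([existing, new])
    labels = np.append(labels, [new_label] * new.shape[1])
    return library, labels, True
```

Because acceptance only appends columns, the existing library is left unchanged and no re-training over earlier classes is involved in this sketch.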
A general computing device 502, in the form of a personal computer or a mobile device, may include a processor 504, memory 506, a removable storage 518, and a non-removable storage 520. The computing device 502 additionally includes a bus 514 and a network interface 516. The computing device 502 may include or have access to the computing system environment 500 that includes user input devices 522, output devices 524, and communication connections 526 such as a network interface card or a universal serial bus connection.
The user input devices 522 may be a digitizer screen and a stylus, trackball, keyboard, keypad, mouse, and the like. The output devices 524 may be a display device of the personal computer or the mobile device. The communication connections 526 may include a local area network, a wide area network, and/or other networks.
The memory 506 may include volatile memory 508 and non-volatile memory 510. A variety of computer-readable storage media may be stored in and accessed from the memory elements of the computing device 502, such as the volatile memory 508 and the non-volatile memory 510, the removable storage 518 and the non-removable storage 520. Computer memory elements may include any suitable memory device(s) for storing data and machine-readable instructions, such as read only memory, random access memory, erasable programmable read only memory, electrically erasable programmable read only memory, hard drive, removable media drive for handling compact disks, digital video disks, diskettes, magnetic tape cartridges, memory cards, Memory Sticks™, and the like.
The processor 504, as used herein, means any type of computational circuit, such as, but not limited to, a microprocessor, a microcontroller, a complex instruction set computing microprocessor, a reduced instruction set computing microprocessor, a very long instruction word microprocessor, an explicitly parallel instruction computing microprocessor, a graphics processor, a digital signal processor, or any other type of processing circuit. The processor 504 may also include embedded controllers, such as generic or programmable logic devices or arrays, application specific integrated circuits, single-chip computers, smart cards, and the like.
Embodiments of the present subject matter may be implemented in conjunction with program modules, including functions, procedures, data structures, and application programs, for performing tasks, or defining abstract data types or low-level hardware contexts. Machine-readable instructions stored on any of the above-mentioned storage media may be executable by the processor 504 of the computing device 502. For example, a computer program 512 may include machine-readable instructions capable of classification of moving objects and user authoring of new object classes, according to the teachings and herein described embodiments of the present subject matter. In one embodiment, the computer program 512 may be included on a compact disk-read only memory (CD-ROM) and loaded from the CD-ROM to a hard drive in the non-volatile memory 510. The machine-readable instructions may cause the computing device 502 to encode according to the various embodiments of the present subject matter.
As shown, the computer program 512 includes a moving object classification module 528. For example, the moving object classification module 528 may be in the form of instructions stored on a non-transitory computer-readable storage medium. The instructions, when executed by the computing device 502, may cause the computing device 502 to perform the methods described in
In various embodiments, the methods and systems described in
Although the present embodiments have been described with reference to specific examples, it will be evident that various modifications and changes may be made to these embodiments without departing from the broader spirit and scope of the various embodiments. Furthermore, the various devices, modules, analyzers, generators, and the like described herein may be enabled and operated using hardware circuitry, for example, complementary metal oxide semiconductor based logic circuitry, firmware, software and/or any combination of hardware, firmware, and/or software embodied in a machine readable medium. For example, the various electrical structures and methods may be embodied using transistors, logic gates, and electrical circuits, such as an application specific integrated circuit.
indicates data missing or illegible when filed
and then selecting the object descriptor indices of Lo corresponding to the largest values. The corresponding object descriptors stacked together are now denoted as Ll, where we drop the subscript ‘o’ for convenience, and l is used here to denote the appropriate set of indices referred to. Further, Ll† denotes the pseudoinverse of Ll. Other suitable realizations of f(Lo, Lm, lo, lm) may also be possible, including matrix-based computations or using dynamic time warping (DTW), for example.
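As an illustration of the DTW alternative mentioned above, the following sketch ranks library motion descriptors by their DTW distance to an extracted motion descriptor. It assumes each motion descriptor can be treated as a time sequence of scalars or feature vectors and uses the standard dynamic-programming recurrence with a Euclidean local cost; how DTW would actually be combined with the object library is not specified, and the names `dtw_distance` and `rank_by_dtw` are hypothetical.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic DTW distance between two sequences.

    a, b : array-likes of shape (T,) or (T, d); lengths may differ.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim == 1:
        a = a[:, None]
    if b.ndim == 1:
        b = b[:, None]
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = np.linalg.norm(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessors.
            cost[i, j] = local + min(cost[i - 1, j],
                                     cost[i, j - 1],
                                     cost[i - 1, j - 1])
    return cost[n, m]

def rank_by_dtw(query, library_sequences, top=3):
    """Indices of the library sequences closest to the query, best first."""
    distances = [dtw_distance(query, s) for s in library_sequences]
    return [int(i) for i in np.argsort(distances)[:top]]
```

DTW tolerates differences in speed between two motions, which is why a sequence that merely lingers on one value can still match with zero distance.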
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/IN2010/000852 | 12/24/2010 | WO | 00 | 6/17/2013 |