Recently, pattern recognition has received a great deal of attention in diverse engineering fields such as oil exploration, biomedical imaging, speaker identification, automated data entry, fingerprint recognition, etc. Many valuable contributions have been reported in these fields [see: Chung, K., Kee, S. C. and Kim, S. R., “Face recognition using principal component analysis of Gabor filter responses”, Proceedings of the 1999 International Workshop on Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems, pp. 53–57; Leondes, C. T., Image Processing and Pattern Recognition, Academic Press, 1998; and Luo, X. and Mirchandani, G., “An integrated framework for image classification”, Proceedings of the 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing, Istanbul, Turkey, Vol. 1, pp. 620–623].
Face recognition is one of the important research topics in this area and has been receiving the attention of many researchers owing to its useful applications, such as system security and human-computer interfaces [Chellappa, R., Wilson, C. L. and Sirohey, S., “Human and machine recognition of faces: a survey”, Technical Report CAR-TR-731, CS-TR-3339, University of Maryland, August 1994].
In conventional pattern recognition, the task is divided into two parts: the first is obtaining a feature space of reduced dimension and complexity, and the second is the classification of that space [Sarlashkar, M. N., Bodruzzaman, M. and Malkani, M. J., “Feature extraction using wavelet transform for neural network based image classification”, Proceedings of the Thirtieth Southeastern Symposium on System Theory, 1998, pp. 412–416].
Neural Networks (NN) have been employed and compared to conventional classifiers for a number of classification problems. The results have shown that the accuracy of the NN approach is equivalent to, or slightly better than, that of other methods. Also, owing to the simplicity and generality of the NN approach, it leads to classifiers that are more efficient [Zhou, W., “Verification of the nonparametric characteristics of back propagation neural networks for image classification”, IEEE Transactions on Geoscience and Remote Sensing, March 1999, Vol. 37, No. 2, pp. 771–779]. As reported in the literature, NN classifiers possess several unique characteristics.
The NN learning is generally classified as supervised or unsupervised. Supervised methods have yielded higher accuracy than unsupervised ones, but suffer from the need for human interaction to determine classes and training regions. In contrast, unsupervised methods determine classes automatically, but generally show limited ability to accurately divide terrain into natural classes [Hara, Y., Atkins, R. G., Yueh, S. H., Shin, R. T. and Kong, J. A., “Application of Neural Networks to Radar Image Classification”, IEEE Transactions on Geoscience and Remote Sensing, January 1994, Vol. 32, No. 1, pp. 100–109; and Herman, P. D. and Khazenie, N., “Classification of multispectral remote sensing data using a back-propagation neural network”, IEEE Transactions on Geoscience and Remote Sensing, January 1992, Vol. 30, pp. 81–88].
In the field of pattern recognition, the combination of an ensemble of neural networks has been used to achieve image classification systems with higher performance than the best performance achievable employing a single neural network. This has been verified experimentally in the literature [Kittler, J., Hatef, M., Duin, R. P. W. and Matas, J., “On combining classifiers”, IEEE Transactions on Pattern Analysis and Machine Intelligence, March 1998, Vol. 20, pp. 226–239]. Also, it has been shown that additional advantages are provided by a neural network ensemble in the context of image classification applications. For example, the combination of neural networks can be used as a “data fusion” mechanism where different NN's process data from different sources [Luo, X. and Mirchandani, G., “An integrated framework for image classification”, Proceedings of the 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing, Istanbul, Turkey, Vol. 1, pp. 620–623]. A number of image classification systems based on combining the outputs of a set of different classifiers have been reported. The different structures for combining classifiers can be grouped as follows [see Luo, X. and Mirchandani, G. above and Ho, T. K., Hull, J. J. and Srihari, S. N., “Decision Combination in Multiple Classifier Systems”, IEEE Transactions on Pattern Analysis and Machine Intelligence, January 1994, Vol. 16, No. 1, pp. 66–75]: (i) the parallel structure; (ii) the pipeline structure; and (iii) the hierarchical structure.
In the parallel structure, the classifiers are used in parallel and their outputs are combined. In the pipeline structure, the classifiers are connected in cascade. The hierarchical structure is a combination of the parallel and pipeline structures of (i) and (ii) above.
The combination methods in the literature are based on voting rules, statistical techniques, belief functions and other classifier fusion schemes [Xu, L., Krzyzak, A. and Suen, C. Y., “Methods for combining multiple classifiers and their applications to handwriting recognition”, IEEE Transactions on Systems, Man and Cybernetics, May–June 1992, Vol. 22, pp. 418–435; Prampero, P. S. and de Carvalho, A. C., “Recognition of Vehicles Silhouette using Combination of Classifiers”, International Joint Conference on Neural Networks (IJCNN'98), 1998, pp. 1723–1726].
Another approach to pattern recognition is shown in U.S. Pat. No. 5,175,775 to Iwaki et al. In its disclosure, a vast number of reference images are grouped, in a first step, into initial groups each containing a limited number of the individual reference images. A most-associated reference image, having a maximum correlation coefficient, is then discriminated for each of the initial groups. Next, in a second step, all of the most-associated reference images thus obtained are regrouped into new groups, each similarly having a limited number of reference images. The number of new groups is accordingly smaller than the number of initial groups. The new groups are again subjected to the correlation operation to enable the next regrouping. Lastly, in a third step, the number of groups is reduced to a single final group which contains fewer than the limited number of reference images. The final group is subjected to the correlation operation with respect to the object image to thereby discriminate the particular reference image exactly corresponding to the object image among the vast number of individual reference images.
It is very important to reduce the complexity and the computation time, and to increase the fidelity, of systems for pattern recognition.
It is a primary objective of the present invention to design a high fidelity pattern recognizer.
Another object of this invention is to provide a recognition algorithm that continuously enhances itself using all the information available up to that point.
A further object of this invention is to provide an evolutionary learning environment employing a recognition algorithm.
Preferred embodiments of the invention include a self-designing intelligent signal processing system comprising: means for receiving signals; and adaptive means for recognizing a pattern from the received signals, wherein the adaptive means are constantly updated over time based on the received signals; and a method of carrying out said system comprising the steps of: receiving signals from a source; recognizing a pattern from the received signals; and adaptively updating the recognizing step with the received signals to enhance the signal processing method over time.
Before explaining the disclosed embodiments of the present invention in detail, it is to be understood that the invention is not limited in its application to the details of the particular arrangement shown since the invention is capable of other embodiments. Also, the terminology used herein is for the purpose of description and not of limitation.
It would be useful to discuss the meanings of some words used herein, and their applications, before discussing the novel self-designing intelligent signal processing system capable of evolutionary learning for classification/recognition of one- and multidimensional signals, including:
Backpropagation—A learning algorithm used to train the neural network to obtain a particular output when a certain input is introduced to the neural network; this training is similar to that of the system used in U.S. Pat. No. 6,418,378, which is incorporated herein by reference.
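For illustration only, a minimal numpy sketch of one backpropagation update for a one-hidden-layer sigmoid network is given below; the architecture, squared-error loss and learning rate are assumptions introduced for the example and are not the training scheme of the incorporated patent.

```python
# Minimal numpy sketch of one backpropagation update for a one-hidden-layer
# sigmoid network (illustrative assumptions only).
import numpy as np

def train_step(x, target, W1, W2, lr=0.1):
    """One gradient-descent update via backpropagation (squared-error loss)."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    h = sigmoid(W1 @ x)              # forward pass: hidden activations
    y = sigmoid(W2 @ h)              # forward pass: network output
    dy = (y - target) * y * (1 - y)  # output-layer delta
    dh = (W2.T @ dy) * h * (1 - h)   # hidden-layer delta, propagated backward
    W2 -= lr * np.outer(dy, h)       # weight updates (in place)
    W1 -= lr * np.outer(dh, x)
    return y
```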
According to this invention, the above objects are achieved by an evolutionary recognition classifier. The advantages and features of this inventive pattern recognition technique are realized by a design with evolutionary learning, which develops the features and selects the criteria that are best suited for the recognition problem under consideration. The evolutionary learning is similar to that used in U.S. Pat. No. 6,263,325 B1, which is incorporated herein by reference. This technique is capable of recognizing an enormously large number of patterns because it analyzes the signal in different domains and explores the distinguishing characteristics in each of these domains. In other words, this approach uses the available information and extracts more characteristics for classification purposes by projecting the signal in different domains. A large number of classification criteria can be developed to greatly enhance the performance of the classifier by exploiting: (a) the structure type; and (b) criteria selection and formulation from the information in the different domains.
In the following section (Section 2), a particular structure illustrates the inventive technique, namely, a parallel implementation approach using the novel Multicriteria Multitransform Neural Network (MCMTNN) classifier for image classification. Then, in Section 3, the experimental results of the parallel MCMTNN classifier are compared with the implementation and experimental results employing a Single Transform Neural Network (STNN) classifier. This comparison between the MCMTNN and STNN classifiers confirms the improved performance of the classifier of the invention. Also in Section 3, experimental results are given demonstrating the improved classification/recognition performance, in the presence of additive noise, of the inventive MCMTNN classifier. Finally, conclusions are presented in Section 4.
Section 2. Multicriteria Multitransform Neural Network (MCMTNN): A Parallel Implementation
In this implementation, shown in
A potentially successful criterion i, with its selected parameter values in a particular domain, clusters the N input signals into a number of distinct non-overlapping clusters. The cluster index, according to the ith criterion, is denoted ci, where ci = 1, 2, 3, . . . , Ci. Corresponding to a number n of selected criteria, i takes the values 1, 2, . . . , or n. It is worthwhile to note that more than one criterion can be derived from a given domain. Also, Ci is, in general, different for different i.
The NN Classifier learning continues, by testing all the criteria presented over the parameter range of each criterion, until a successful set of criteria is obtained. A successful classifier using n criteria, i.e., n decisions, should yield a unique composite index (c1c2c3 . . . cn) corresponding to each of the N input signals.
Also, it is easy to show that

c1 = 1, 2, . . . , C1; c2 = 1, 2, . . . , C2; . . . ; cn = 1, 2, . . . , Cn

and

n = n1 + n2 + n3 + . . . + nD

where D is the number of transform domains, and nk is the number of criteria in the kth domain, k = 1, 2, . . . , D.
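As an illustration of the composite-index idea, the following minimal sketch (not taken from the patent text) uses three toy criteria, each assigning one of Ci = 4 cluster indices, so up to C1·C2·C3 = 64 signals can in principle receive distinct composite indices; the learning would keep adjusting the criteria until the uniqueness test below succeeds. The toy criteria are placeholders for learned transform-domain criteria.

```python
# Illustrative sketch: three toy criteria form the composite index (c1 c2 c3).
import numpy as np

def composite_index(signal, criteria):
    """Composite index: the tuple of cluster indices from all n criteria."""
    return tuple(criterion(signal) for criterion in criteria)

rng = np.random.default_rng(0)
signals = [rng.normal(size=64) for _ in range(31)]  # N = 31 input signals
criteria = [
    lambda s: int(np.sum(s[:16]) % 4),     # stand-in for a DCT-domain criterion
    lambda s: int(np.sum(np.abs(s)) % 4),  # stand-in for a HAAR-domain criterion
    lambda s: int(np.linalg.norm(s) % 4),  # stand-in for an SVD-based criterion
]

# Learning would continue until the composite indices are unique for all N
# signals, i.e., until this test succeeds.
indices = [composite_index(s, criteria) for s in signals]
print("successful set of criteria:", len(set(indices)) == len(signals))
```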
Section 3. Experimental Results
Sample experimental results are given to illustrate the performance of the technique. In this example, thirty-one 8-bit gray-level facial images are downloaded from the Internet, as shown in
3-i MCMTNN Classifier
A resulting successful structure is shown in
Reference should now be made to
The inventive technique, when presented with a given classification task, can yield more than one classifier. For example, in this experiment another classifier, denoted Classifier 2, is obtained which yields 100% classification accuracy. It uses the same structure as Classifier 1 except for the third NN, where the learning results in a criterion that employs the sum of the second ten largest singular values instead of the first ten largest singular values. The performance of different classifiers can be evaluated, and/or the redundancy can be used to devise a voting scheme to enhance the accuracy of classification of incomplete or corrupted data.
Alternatively, a design criterion is incorporated in the design of the classifiers such that the fused data from the different classifiers yields acceptable performance under non-ideal conditions, i.e., distortion, corrupted signals, noise, etc. An example of such non-ideal conditions is given in the following section, 3-ii.
3-ii MCMTNN Classifier of Corrupted Images
White noise of up to 10 gray levels is added to each facial image shown in
3-ii-a Classification of Noisy Images by Classifier 1:
From the thirty-one images in
3-ii-b Classification of Noisy Images by Classifier 2:
From the thirty-one input images in
It is significant to note that Classifier 2 is designed such that the images that Classifier 1 fails to recognize are recognized successfully by Classifier 2, and vice versa, when an appropriate additional deciding criterion is introduced. This is illustrated in the following Section 3-ii-c:
3-ii-c Classification Enhancement by Combining Classifier 1 and Classifier 2:
By using a simple detector and another feature of the image (energy, in this example), it is determined which of the two results from Classifiers 1 and 2, when they disagree, is correct. This additional voting step resulted in recognizing all of the 31 images successfully.
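A hedged sketch of this combining step follows; the energy feature matches the text, but the stored reference-energy table, the nearest-reference decision rule, and all names are illustrative assumptions rather than the exact detector of the experiment.

```python
# Sketch: resolve Classifier 1 / Classifier 2 disagreements with an
# image-energy detector (illustrative assumptions).
import numpy as np

def image_energy(img):
    """Energy feature: sum of squared gray levels."""
    return float(np.sum(img.astype(np.float64) ** 2))

def combined_decision(img, label1, label2, energy_table):
    """Pick the candidate whose stored reference energy best matches the input."""
    if label1 == label2:
        return label1                      # the classifiers agree
    e = image_energy(img)
    d1 = abs(e - energy_table[label1])     # distance to candidate 1's reference
    d2 = abs(e - energy_table[label2])     # distance to candidate 2's reference
    return label1 if d1 <= d2 else label2
```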
3-iii Single Transform Neural Network (STNN) Classifier:
This structure uses one NN and one transform as shown in
In the DCT domain, six images out of the thirty-one are recognized successfully, and the criterion selected by the STNN is the sum of the 4×4 low-frequency components;
The pattern recognition technique is capable of recognizing a large number of patterns by analyzing the projected patterns (signals) in different domains, exploring the distinguishing characteristics, and formulating the corresponding criteria in each of these domains. The optimum set of criteria is selected by the classifier to satisfy certain constraints. According to these criteria, at each node in each stage, the signals presented are binary clustered, i.e., divided into two groups, according to an appropriately selected criterion. The members of the same group are assigned an index of 0 or 1. Similarly, the signals of each group are binary clustered repeatedly. The process is continued until each signal, or group of signals, is identified by a unique composite index.
A useful structure and the corresponding classification algorithm are hereafter described. Sample experimental results illustrating the performance of the pattern recognition system are also given. The recognition process can be considered a special case of classification in which, at the classifier output, each classified group contains only one signal. In the work presented here, the terms classification and recognition are used interchangeably. Also, the patterns to be classified are treated as signals and are frequently referred to as such.
In this Section, a useful multistage structure with binary classification at each node in each stage is described, the algorithm employed is presented, and the algorithm's evolutionary learning is shown.
As will become clear in the following sections, for unique classification of N patterns the number of Binary Classification Units (BCU's) needed equals N−1, irrespective of the number of patterns in each subgroup after each BCU; this follows from the fact that a binary tree with N leaves has exactly N−1 internal nodes. The number of stages each pattern has to go through to be identified depends on the subgrouping strategy. Two extreme situations are given in the following sections.
In
The optimum classification structure is selected according to the nature of the problem under consideration. The case of unknown probability of occurrence of the signals is examined and the algorithm is presented. Also, simple implementation examples are given to illustrate the performance of this technique.
A. The Binary Classification Unit (BCU)
At each node, two possible BCU designs are described. The first design, BCU-1 of
1) BCU-1:
In the structure shown in
gwj = sj

gw(j+1) = Sj − sj = Sj+1

where the subscripts wj and w(j+1) are binary words equal to j and j+1, respectively. Referring to the multistage structure in
The N signals are extracted in descending order of their probability. Thus, a signal with higher probability is classified in fewer classification stages than a signal with lower probability. This, in general, is expected to lead to an overall reduction in the average time and amount of computation required for recognition.
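The following is a minimal sketch, assuming the BCU-1 chain described above: the kth most probable signal is identified after k stages, and the last two signals share the final (N−1)th BCU; the helper computes the resulting average number of stages for a given probability profile.

```python
# Sketch: average stage count for the BCU-1 chain (assumed stage assignment).
def average_stages(probs):
    """Expected number of classification stages for the BCU-1 chain."""
    p = sorted(probs, reverse=True)                 # extract in descending probability
    n = len(p)
    stages = [min(k + 1, n - 1) for k in range(n)]  # last two signals share a stage
    return sum(pk * sk for pk, sk in zip(p, stages))

# Skewed probabilities reward the chain: frequent signals exit after few stages.
print(average_stages([0.5, 0.25, 0.125, 0.125]))    # 1.75 stages on average
```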
2) BCU-2:
The BCU used at each node, BCU-2, groups the input patterns into two groups, each containing the same number of patterns. The corresponding structure is shown in
B. MSB Structure and Algorithm
The structure is shown in
1. The MSB System in the Learning Mode:
Referring to
2. The MSB System in the Running Mode:
Referring to
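As a sketch only, the following toy implementation illustrates the MSB idea under stated assumptions: each BCU node splits its input group into two equal subgroups by thresholding a scalar feature at the median (a stand-in for a learned BCU-2 criterion), the learning mode builds the tree until every leaf holds one pattern, the running mode walks an unknown pattern down the tree, and the BCU count comes out to N−1. Distinct feature values are assumed.

```python
# Toy MSB structure: median-threshold BCUs in place of learned criteria.
import numpy as np

class BCUNode:
    def __init__(self, patterns, feature):
        # Learning mode: choose a threshold and build subtrees recursively.
        self.feature = feature
        values = sorted(feature(p) for p in patterns)
        self.threshold = values[len(values) // 2 - 1]  # balanced (BCU-2) split
        left = [p for p in patterns if feature(p) <= self.threshold]
        right = [p for p in patterns if feature(p) > self.threshold]
        self.left = left[0] if len(left) == 1 else BCUNode(left, feature)
        self.right = right[0] if len(right) == 1 else BCUNode(right, feature)

    def classify(self, pattern):
        # Running mode: walk the unknown pattern down to a single leaf.
        branch = self.left if self.feature(pattern) <= self.threshold else self.right
        return branch.classify(pattern) if isinstance(branch, BCUNode) else branch

def count_bcus(node):
    """Total number of BCU nodes in the tree (equals N - 1)."""
    return 1 + sum(count_bcus(c) for c in (node.left, node.right)
                   if isinstance(c, BCUNode))

patterns = list(np.linspace(0.0, 1.0, 8))         # N = 2**3 toy patterns
tree = BCUNode(patterns, feature=lambda p: p)
print(count_bcus(tree))                           # 7 BCUs, i.e., N - 1
print(tree.classify(patterns[5]) == patterns[5])  # pattern correctly identified
```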
Two simple examples are given to illustrate the performance of the technique. The results obtained using the MSB technique in the case of noisy or corrupted data are compared to those obtained using a Single Transform Neural Network (STNN) as well as Multi-Input Neural Networks.
A. First Example
In this example, eight 8-bit gray-level facial images are downloaded from the Internet (Olivetti Research Laboratory, ORL), as shown in
1. Results using the MSB Classifier:
The technique is employed for the images of
2. Results using the STNN Classifier:
This structure uses one NN and one transform as in
3. Multi-Input Neural Network Classifier:
A Multi-Input Neural Network has been trained to recognize the 8 facial images. Different combinations of 3 criteria have been used in training the Multi-Input Neural Network with different mean square errors (MSE). In the best result obtained, 4 images out of 8 were recognized correctly.
In order to facilitate a better understanding of the invention, it is useful to discuss the relation between:
There are different techniques that could be used in clustering. The signals could be grouped into 4 groups as shown in
In order to fully understand the invention described herein, it would be useful to describe the best mode of carrying it out by referring again to
In the implementation of
Referring specifically to
In the first branch of
For 2-D (two dimensions), the DCT of an M×N image x(m, n) is given by

X(u, v) = α(u)α(v) Σ(m=0 to M−1) Σ(n=0 to N−1) x(m, n) cos[(2m+1)uπ/2M] cos[(2n+1)vπ/2N]

where α(0) = 1/√M and α(u) = √(2/M) for u ≠ 0, and similarly for α(v) with N in place of M.
The result is a matrix which represents the specified image in the DCT domain. This matrix has low-frequency coefficients, high-frequency coefficients, and some frequencies in between. The criterion 35 used in this system is the sum of the 4×4 spatial low-frequency components in the DCT domain 30 for each image. Each image will thereby be represented by a certain component according to the DCT. The component representing each image will be introduced to the neural network 36. The neural network 36 will group the 31 images into 4 groups according to the values of these components. Each image will have a group number as shown in
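A minimal sketch of this first criterion is given below, assuming the image is held in a numpy array; scipy.fft.dctn computes the 2-D type-II DCT, and the criterion is the sum of the 4×4 lowest-frequency coefficients, as described above. The function name is illustrative.

```python
# Sketch of criterion 35: sum of the 4x4 low-frequency DCT coefficients.
import numpy as np
from scipy.fft import dctn

def dct_criterion(img):
    """Sum of the 4x4 lowest-frequency DCT coefficients of an image."""
    coeffs = dctn(img.astype(np.float64), norm="ortho")  # 2-D type-II DCT
    return float(coeffs[:4, :4].sum())
```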
In the second branch of
The second criterion 37 is the sum of the 4×4 low-sequency HAAR coefficients for each image. The second NN 38 will also group the images into 4 groups.
In the third branch of
The result is that each of the digitized images has three group numbers resulting from the groupings by NN 36, NN 38 and NN 40. Thus the clusters C1, C2 and C3 of
Each image is represented by a composite index representing the group numbers to which it belongs with respect to each criterion. This is a unique index for each image; it differs from one image to another. The final result has been represented by a simple example. These results come out of the learning, as different criteria have been tried for this problem and the criteria that give the best results in the case of noisy images have been selected. A sketch of this three-branch pipeline follows.
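The following sketch ties the three branches together under stated assumptions: PyWavelets' 'haar' wavelet stands in for the HAAR transform, a simple quartile quantizer stands in for the trained neural networks 36, 38 and 40, and the composite index is the triple of group numbers. All function names, the image sizes and the group boundaries are illustrative.

```python
# Sketch of the three-branch pipeline: DCT, HAAR and SVD criteria, each
# quantized into 4 groups, yield a composite index per image.
import numpy as np
import pywt
from scipy.fft import dctn

def dct_criterion(img):
    return float(dctn(img, norm="ortho")[:4, :4].sum())  # 4x4 low-frequency sum

def haar_criterion(img):
    cA, _ = pywt.dwt2(img, "haar")                       # low-sequency band
    return float(cA[:4, :4].sum())

def svd_criterion(img):
    s = np.linalg.svd(img, compute_uv=False)             # singular values, descending
    return float(s[:10].sum())                           # ten largest singular values

def group_of(value, edges):
    """Quantize a criterion value into one of 4 groups (stand-in for a NN)."""
    return int(np.searchsorted(edges, value))

def composite_index(img, edge_table,
                    crits=(dct_criterion, haar_criterion, svd_criterion)):
    return tuple(group_of(c(img), e) for c, e in zip(crits, edge_table))

# Usage: quartile group boundaries derived from 31 (here random) training images.
imgs = [np.random.default_rng(i).uniform(0, 255, (64, 64)) for i in range(31)]
edge_table = [np.quantile([c(im) for im in imgs], [0.25, 0.5, 0.75])
              for c in (dct_criterion, haar_criterion, svd_criterion)]
print(sorted({composite_index(im, edge_table) for im in imgs}))
```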
It is in the evolutionary learning feature of this invention that its advantage and value are realized. Refer now to
Reference should now be made to
The pattern recognition system will be tested with noisy images and the selected criteria will be evaluated. In other words, the selected criteria will be tested to find out whether they can still classify the signals successfully when the signals are corrupted with noise. The Criteria Evaluation 144 is made, and for criteria of low classification accuracy the signals will be returned for criteria extraction at 140 with other criteria.
If the criteria give high classification accuracy in the case of noisy data, they will be retained. In the case of inaccurate classification, other criteria will be tested until the optimum criteria are selected, i.e., those which give the highest classification accuracy in the case of noisy signals.
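For illustration, a minimal sketch of this evaluate-and-retain loop is given below; the nearest-reference decision rule, the uniform noise model, and the accuracy threshold are assumptions introduced for the example, not the exact evaluation of block 144.

```python
# Sketch of the evolutionary evaluate-and-retain loop for candidate criteria.
import numpy as np

def evaluate_criterion(criterion, images, labels, noise_level=10, trials=20):
    """Fraction of noisy images mapped back to the correct label."""
    rng = np.random.default_rng(0)
    reference = {lab: criterion(img) for img, lab in zip(images, labels)}
    hits = 0
    for img, lab in zip(images, labels):
        for _ in range(trials):
            noisy = img + rng.uniform(-noise_level, noise_level, img.shape)
            value = criterion(noisy)
            nearest = min(reference, key=lambda l: abs(reference[l] - value))
            hits += nearest == lab
    return hits / (len(images) * trials)

def select_criteria(candidates, images, labels, threshold=0.95):
    """Retain high-accuracy criteria; the rest go back for re-extraction."""
    return [c for c in candidates
            if evaluate_criterion(c, images, labels) >= threshold]
```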
A novel MCMTNN classifier for one- and multidimensional signal classification using multicriteria processing and data fusion has been described. The multiple criteria are extracted from the projections of the signals to be classified in multiple transform domains. The implementation example demonstrates the utility of the MCMTNN classifier for image classification. It employs NN's in parallel and three classification criteria obtained from the image projections in three transform domains. These results are compared with a traditional STNN classifier. The comparison between the MCMTNN and STNN classifiers illustrates the improved performance of the MCMTNN classifier in terms of an appreciable reduction in overall computational complexity and increased speed.
Additional results, some of which are given in Section 3, confirm the superior performance of the MCMTNN classifier relative to the STNN approach for image classification in the presence of additive noise. Since the inventive MCMTNN classifier is capable of evolutionary learning, selecting the criteria as well as optimizing each criterion for best overall performance, it yields enhanced performance and classification accuracy under different classification conditions such as noisy and incomplete data.
Another structure was discussed in Section 4. A Multi-Stage Binary pattern classification/recognition system (MSB) is presented to classify N patterns (or groups), where N = 2^r. Two cases are presented. In the first case, no a priori information regarding the statistical properties of occurrence of the patterns is available. The MSB structure uses (N−1) Binary Classification Units, and the number of stages required to identify an unknown pattern (or group) is equal to r. In the second case, the probability of occurrence differs from one signal to another, and the signals are extracted in descending order of their probability. The system is capable of evolutionary learning to extract and optimize the classification criteria employed by the Binary Classification Units.
While the invention has been described, disclosed, illustrated and shown in various terms of certain embodiments or modifications which it has presumed in practice, the scope of the invention is not intended to be, nor should it be deemed to be, limited thereby and such other modifications or embodiments as may be suggested by the teachings herein are particularly reserved especially as they fall within the breadth and scope of the claims here appended.
This invention relates to a one- and multidimensional signal classification neural network system that employs a set of criteria extracted from the signal representations in different transform domains [denoted the Multicriteria Multitransform Neural Network (MCMTNN) classifier], and more particularly to the signal projection in each appropriately selected transform domain, which reveals unique signal characteristics, whereby the criteria in the different domains are properly formulated and their parameters adapted to obtain classification with desirable implementation properties such as speed and accuracy. This application claims the benefit of U.S. Provisional Application No. 60/315,420, filed Aug. 28, 2001.
Number | Name | Date | Kind |
---|---|---|---|
4254399 | Burkhardt et al. | Mar 1981 | A |
5175775 | Iwaki et al. | Dec 1992 | A |
5432906 | Newman et al. | Jul 1995 | A |
5459636 | Gee et al. | Oct 1995 | A |
5497430 | Sadovnik et al. | Mar 1996 | A |
5537669 | Evans et al. | Jul 1996 | A |
5550928 | Lu et al. | Aug 1996 | A |
5640468 | Hsu | Jun 1997 | A |
5680481 | Prasad et al. | Oct 1997 | A |
5696849 | Blackham | Dec 1997 | A |
5781650 | Lobo et al. | Jul 1998 | A |
5805721 | Vuylsteke et al. | Sep 1998 | A |
5835616 | Lobo et al. | Nov 1998 | A |
5864630 | Cosatto et al. | Jan 1999 | A |
6011865 | Fujisaki et al. | Jan 2000 | A |
6816611 | Hagiwara et al. | Nov 2004 | B1 |
20020069218 | Sull et al. | Jun 2002 | A1 |
Number | Date | Country | |
---|---|---|---|
60315420 | Aug 2001 | US |